Published in: BMC Public Health 1/2019

Open Access 01-12-2019 | Public Health | Research article

Machine learning to refine decision making within a syndromic surveillance service

Authors: I. R. Lake, F. J. Colón-González, G. C. Barker, R. A. Morbey, G. E. Smith, A. J. Elliot

Abstract

Background

Worldwide, syndromic surveillance is increasingly used for improved and timely situational awareness and early identification of public health threats. Syndromic data streams are fed into detection algorithms, which produce statistical alarms highlighting potential activity of public health importance. All alarms must be assessed to confirm whether they are of public health importance. In England, approximately 100 alarms are generated daily and, although their analysis is formalised through a risk assessment process, the process requires notable time, training, and maintenance of an expertise base to determine which alarms are of public health importance. The process is further complicated because only 0.1% of statistical alarms are deemed to be of public health importance. Therefore, the aim of this study was to evaluate machine learning as a tool for computer-assisted human decision-making when assessing statistical alarms.

Methods

A record of the risk assessment process was obtained from Public Health England for all 67,505 statistical alarms between August 2013 and October 2015. This record contained information on the characteristics of the alarm (e.g. size, location). We used three Bayesian classifiers (naïve Bayes, tree-augmented naïve Bayes and Multinets) to examine the risk assessment record in England with respect to the final ‘Decision’ outcome made by an epidemiologist of ‘Alert’, ‘Monitor’ or ‘No-action’. Two further classifications based upon tree-augmented naïve Bayes and Multinets were implemented to account for the predominance of ‘No-action’ outcomes.
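
As an illustration of the simplest of these classifiers, the following is a minimal sketch of a categorical naïve Bayes model with add-one smoothing. It is not the authors' implementation: the class name `NaiveBayes`, the attribute names `size` and `location`, their values, and the six training rows are all hypothetical toy data invented for this example.

```python
from collections import Counter, defaultdict
import math


class NaiveBayes:
    """Categorical naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.n = len(labels)
        # counts[(label, attribute)][value] = number of training rows
        self.counts = defaultdict(Counter)
        # values[attribute] = distinct values seen for that attribute
        self.values = defaultdict(set)
        for row, label in zip(rows, labels):
            for attr, value in row.items():
                self.counts[(label, attr)][value] += 1
                self.values[attr].add(value)
        return self

    def log_posteriors(self, row):
        """Unnormalised log posterior for each class, assuming attributes
        are conditionally independent given the class."""
        scores = {}
        for c in self.classes:
            score = math.log(self.class_counts[c] / self.n)
            for attr, value in row.items():
                num = self.counts[(c, attr)][value] + 1
                den = self.class_counts[c] + len(self.values[attr])
                score += math.log(num / den)
            scores[c] = score
        return scores

    def predict(self, row):
        scores = self.log_posteriors(row)
        return max(scores, key=scores.get)


# Hypothetical toy alarms; not data from the study.
rows = [
    {"size": "large", "location": "national"},
    {"size": "large", "location": "national"},
    {"size": "large", "location": "regional"},
    {"size": "small", "location": "regional"},
    {"size": "small", "location": "local"},
    {"size": "small", "location": "local"},
]
labels = ["Alert", "Alert", "Monitor", "No-action", "No-action", "No-action"]

model = NaiveBayes().fit(rows, labels)
decision = model.predict({"size": "large", "location": "national"})
```

With this toy data, a large national alarm is assigned to ‘Alert’ and a small local alarm to ‘No-action’. Tree-augmented naïve Bayes and Multinets relax the independence assumption and are not shown here.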

Results

The attributes of each individual risk assessment were linked to the final decision made by an epidemiologist, providing confidence in the current process. The naïve Bayes classifier performed best, correctly classifying 51.5% of ‘Alert’ outcomes. If the ‘Alert’ and ‘Monitor’ actions are combined then performance increases to 82.6% correctly classified. We demonstrate how a decision support system based upon a naïve Bayes classifier could be deployed within an operational syndromic surveillance system.
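
The effect of combining outcome classes on the percentage correctly classified can be sketched as follows. This is an illustrative calculation only: the function names and the six actual/predicted labels are hypothetical, and the numbers it yields are unrelated to the 51.5% and 82.6% reported above.

```python
from collections import Counter


def percent_correct_by_class(actual, predicted):
    """Percentage of each actual class that was predicted correctly."""
    totals = Counter(actual)
    hits = Counter(a for a, p in zip(actual, predicted) if a == p)
    return {c: 100.0 * hits[c] / totals[c] for c in totals}


def merge_actionable(labels):
    """Collapse 'Alert' and 'Monitor' into a single 'Alert/Monitor' class."""
    return ["Alert/Monitor" if x in ("Alert", "Monitor") else x for x in labels]


# Hypothetical decisions and classifier outputs; not data from the study.
actual = ["Alert", "Alert", "Monitor", "Monitor", "No-action", "No-action"]
predicted = ["Alert", "Monitor", "Monitor", "No-action", "No-action", "No-action"]

separate = percent_correct_by_class(actual, predicted)
combined = percent_correct_by_class(
    merge_actionable(actual), merge_actionable(predicted)
)
```

In this toy case, `separate["Alert"]` is 50.0 and `combined["Alert/Monitor"]` is 75.0, because an ‘Alert’ that was predicted as ‘Monitor’ counts as correct once the two classes are merged.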

Conclusions

Within syndromic surveillance systems, machine learning techniques have the potential to make risk assessment following statistical alarms more automated, robust, and rigorous. However, our results also highlight the importance of specialist human input to the process.
Metadata
Title
Machine learning to refine decision making within a syndromic surveillance service
Authors
I. R. Lake
F. J. Colón-González
G. C. Barker
R. A. Morbey
G. E. Smith
A. J. Elliot
Publication date
01-12-2019
Publisher
BioMed Central
Keyword
Public Health
Published in
BMC Public Health / Issue 1/2019
Electronic ISSN: 1471-2458
DOI
https://doi.org/10.1186/s12889-019-6916-9
