Published in: BMC Infectious Diseases 1/2020

01-12-2020 | Diarrhea | Research article

Forecasting incidence of infectious diarrhea using random forest in Jiangsu Province, China

Authors: Xinyu Fang, Wendong Liu, Jing Ai, Mike He, Ying Wu, Yingying Shi, Wenqi Shen, Changjun Bao

Abstract

Background

Infectious diarrhea contributes considerably to the global disease burden, so accurate prediction of infectious diarrhea epidemics is crucial for public health authorities. This study aimed to develop an optimal random forest (RF) model that incorporates meteorological factors to predict the incidence of infectious diarrhea in Jiangsu Province, China.

Methods

An RF model was developed and compared with classical autoregressive integrated moving average models, both without exogenous variables (ARIMA) and with them (ARIMAX). Morbidity and meteorological data from 2012 to 2016 were used to construct the models, and the data from 2017 were used for testing.
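The chronological split described above (2012 to 2016 for model construction, 2017 for testing) could be sketched as follows. This is a minimal illustration, not the authors' code: the record layout, the field names `week_start` and `morbidity`, the helper `split_by_year`, and all values are hypothetical placeholders.

```python
from datetime import date


def split_by_year(records, last_train_year):
    """Chronological split with no shuffling.

    Rows dated up to and including last_train_year go to the training
    set; later rows are held out, so the test period always follows
    the training period in time.
    """
    train = [r for r in records if r["week_start"].year <= last_train_year]
    test = [r for r in records if r["week_start"].year > last_train_year]
    return train, test


# Hypothetical weekly records (placeholder values, not study data).
records = [
    {"week_start": date(2012, 1, 2), "morbidity": 1.0},
    {"week_start": date(2016, 12, 26), "morbidity": 2.0},
    {"week_start": date(2017, 1, 2), "morbidity": 3.0},
    {"week_start": date(2017, 12, 25), "morbidity": 4.0},
]

train, test = split_by_year(records, last_train_year=2016)
```

Splitting by date rather than at random keeps future observations out of the training set, which matters for time series because neighbouring weeks are correlated.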

Results

The RF model used atmospheric pressure, precipitation, relative humidity, and their lagged terms, together with morbidity lagged by 1–4 weeks and a time variable, as predictors. A univariate model, ARIMA(1,0,1)(1,0,0)₅₂ (AIC = −575.92, BIC = −558.14), and a multivariable model, ARIMAX(1,0,1)(1,0,0)₅₂ with 0–1 week lag precipitation (AIC = −578.58, BIC = −578.13), were developed as benchmarks. The RF model outperformed the ARIMA/X models, with a mean absolute percentage error (MAPE) of approximately 20%. The ARIMAX model performed comparably to the ARIMA model, with a MAPE of approximately 30%.
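To make the predictor construction and the error metric concrete, the sketch below builds lagged feature rows from weekly series and computes MAPE. It is an illustration under stated assumptions, not the authors' implementation: the helper names `make_lagged_rows` and `mape`, the dictionary layout, and all numbers are hypothetical, and the resulting rows would still need to be passed to a random forest library of choice.

```python
def make_lagged_rows(series, target, lags):
    """Build a lagged design matrix from equal-length weekly series.

    series: dict mapping variable name -> list of weekly values
    target: name of the variable to predict
    lags:   dict mapping variable name -> list of lags in weeks
            (0 means the same week as the target)
    Returns (columns, X, y); columns lists the (name, lag) of each feature.
    """
    n = len(series[target])
    max_lag = max(lag for lag_list in lags.values() for lag in lag_list)
    columns = [(name, lag) for name in sorted(lags) for lag in lags[name]]
    X, y = [], []
    # Start at max_lag so that every lagged value exists.
    for t in range(max_lag, n):
        X.append([series[name][t - lag] for name, lag in columns])
        y.append(series[target][t])
    return columns, X, y


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical placeholder series (not study data).
series = {
    "morbidity": [1, 2, 3, 4, 5, 6],
    "precipitation": [10, 20, 30, 40, 50, 60],
}
# 1-4 week lag morbidity and 0-1 week lag precipitation, as in the text.
lags = {"morbidity": [1, 2, 3, 4], "precipitation": [0, 1]}

columns, X, y = make_lagged_rows(series, "morbidity", lags)
error = mape([100, 200], [80, 240])
```

With these toy inputs, each forecast misses its actual value by 20%, so `error` comes out at about 20, the same order as the RF result reported above.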

Conclusions

The RF model fitted the dynamic nature of the infectious diarrhea epidemic well and delivered good prediction accuracy. It combined the synchronous and lagged effects of meteorological factors and also captured the autocorrelation and seasonality of morbidity. The RF model can be used to predict the epidemic level and has high potential for practical implementation.
Metadata
Title
Forecasting incidence of infectious diarrhea using random forest in Jiangsu Province, China
Authors
Xinyu Fang
Wendong Liu
Jing Ai
Mike He
Ying Wu
Yingying Shi
Wenqi Shen
Changjun Bao
Publication date
01-12-2020
Publisher
BioMed Central
Keywords
Diarrhea
Published in
BMC Infectious Diseases / Issue 1/2020
Electronic ISSN: 1471-2334
DOI
https://doi.org/10.1186/s12879-020-4930-2