Published in: BMC Medicine 1/2022

Open Access 01-12-2022 | Premature Birth | Research article

Dense phenotyping from electronic health records enables machine learning-based prediction of preterm birth

Authors: Abin Abraham, Brian Le, Idit Kosti, Peter Straub, Digna R. Velez-Edwards, Lea K. Davis, J. M. Newton, Louis J. Muglia, Antonis Rokas, Cosmin A. Bejan, Marina Sirota, John A. Capra

Abstract

Background

Identifying pregnancies at risk for preterm birth, one of the leading causes of worldwide infant mortality, has the potential to improve prenatal care. However, we lack broadly applicable methods to accurately predict preterm birth risk. The dense longitudinal information present in electronic health records (EHRs) is enabling scalable and cost-efficient risk modeling of many diseases, but EHR resources have been largely untapped in the study of pregnancy.

Methods

Here, we apply machine learning to diverse data from EHRs with 35,282 deliveries to predict singleton preterm birth.

Results

We find that machine learning models based on billing codes alone can predict preterm birth risk at various gestational ages (e.g., ROC-AUC = 0.75, PR-AUC = 0.40 at 28 weeks of gestation) and outperform comparable models trained using known risk factors (e.g., ROC-AUC = 0.65, PR-AUC = 0.25 at 28 weeks). Examining the patterns learned by the model reveals it stratifies deliveries into interpretable groups, including high-risk preterm birth subtypes enriched for distinct comorbidities. Our machine learning approach also predicts preterm birth subtypes (spontaneous vs. indicated), mode of delivery, and recurrent preterm birth. Finally, we demonstrate the portability of our approach by showing that the prediction models maintain their accuracy on a large, independent cohort (5,978 deliveries) from a different healthcare system.
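
For readers less familiar with the two metrics reported above, the following is a minimal sketch of how ROC-AUC and PR-AUC can be computed from predicted risk scores. It is not the authors' code: the function names, the toy labels, and the toy scores are all hypothetical, and PR-AUC is approximated here by average precision, which is one common convention. The sketch also assumes no tied scores.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve.

    Assumes all scores are distinct (hypothetical simplification).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank deliveries from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


# Hypothetical toy data: 1 = preterm delivery, 0 = term delivery.
toy_labels = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

toy_roc = roc_auc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
toy_prevalence = sum(toy_labels) / len(toy_labels)

print(f"ROC-AUC = {toy_roc:.3f}")
print(f"PR-AUC (average precision) = {toy_ap:.3f}")
print(f"prevalence = {toy_prevalence:.2f}")
```

A random classifier is expected to score about 0.5 on ROC-AUC but only about the outcome prevalence on PR-AUC, which is why PR-AUC values should be read against the preterm birth rate of the cohort rather than against 0.5.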

Conclusions

By leveraging rich phenotypic and genetic features derived from EHRs, we suggest that machine learning algorithms have great potential to improve medical care during pregnancy. However, further work is needed before these models can be applied in clinical settings.

Metadata

Publication date: 01-12-2022
Publisher: BioMed Central
Keyword: Premature Birth
Electronic ISSN: 1741-7015
DOI: https://doi.org/10.1186/s12916-022-02522-x