Published in: BMC Medical Research Methodology 1/2020

Open Access 01-12-2020 | Alzheimer's Disease | Technical advance

Directed acyclic graphs and causal thinking in clinical risk prediction modeling

Authors: Marco Piccininni, Stefan Konigorski, Jessica L. Rohmann, Tobias Kurth

Abstract

Background

In epidemiology, causal inference and prediction modeling methodologies have historically been distinct. Directed Acyclic Graphs (DAGs) are used to model a priori causal assumptions and inform variable selection strategies for causal questions. Although tools originally designed for prediction are finding applications in causal inference, the converse, using causal tools to inform prediction, has remained largely unexplored. The aim of this theoretical and simulation-based study is to assess the potential benefit of using DAGs in clinical risk prediction modeling.

Methods

We explore how incorporating knowledge about the underlying causal structure can provide insights about the transportability of diagnostic clinical risk prediction models to different settings. We further probe whether causal knowledge can be used to improve predictor selection in clinical risk prediction models.

Results

A single-predictor model in the causal direction is likely to have better transportability than one in the anticausal direction in some scenarios. We empirically show that the Markov Blanket, the set of variables including the parents, children, and parents of the children of the outcome node in a DAG, is the optimal set of predictors for that outcome.
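
The Markov Blanket definition above (parents, children, and parents of the children of the outcome node) can be read directly off a DAG. The following is a minimal sketch, not the authors' code: the function name `markov_blanket`, the parent-set dictionary representation, and the toy graph with its node names are all hypothetical and chosen only for illustration.

```python
def markov_blanket(parents, target):
    """Return the Markov Blanket of `target` in a DAG.

    `parents` maps every node to the set of its parents (direct causes).
    This representation is an assumption made for this sketch.
    """
    # Children: nodes that list `target` among their parents.
    children = {node for node, pa in parents.items() if target in pa}
    # Parents of the children (includes `target` itself, removed below).
    co_parents = set()
    for child in children:
        co_parents |= set(parents[child])
    blanket = set(parents.get(target, ())) | children | co_parents
    blanket.discard(target)
    return blanket


# Hypothetical toy DAG, not taken from the article.
toy_dag = {
    "age": set(),
    "genotype": set(),
    "disease": {"age", "genotype"},           # outcome node
    "assay_batch": set(),
    "biomarker": {"disease", "assay_batch"},  # child of the outcome
    "symptom": {"biomarker"},                 # grandchild, outside the blanket
    "hospital": set(),                        # unconnected node
}

print(sorted(markov_blanket(toy_dag, "disease")))
# ['age', 'assay_batch', 'biomarker', 'genotype']
```

In this toy graph, `symptom` and `hospital` are excluded: given the blanket variables, they carry no further information about `disease` under the assumed structure.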

Conclusions

Our findings provide a theoretical basis for the intuition that a diagnostic clinical risk prediction model including causes as predictors is likely to be more transportable. Furthermore, using DAGs to identify Markov Blanket variables may be a useful, efficient strategy to select predictors in clinical risk prediction models if strong knowledge of the underlying causal structure exists or can be learned.
Metadata
Title
Directed acyclic graphs and causal thinking in clinical risk prediction modeling
Authors
Marco Piccininni
Stefan Konigorski
Jessica L. Rohmann
Tobias Kurth
Publication date
01-12-2020
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2020
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/s12874-020-01058-z
