Published in: Prevention Science 3/2019

01-04-2019

Non-Gaussian Methods for Causal Structure Learning

Author: Shohei Shimizu



Abstract

Causal structure learning is one of the most exciting new topics in machine learning and statistics. In many empirical sciences, including prevention science, the causal mechanisms underlying various phenomena need to be studied. In many cases, however, classical methods for causal structure learning cannot estimate the causal structure of variables, because they explicitly or implicitly assume Gaussian data and typically use only the covariance structure. In many applications the data are non-Gaussian, which means that the data distribution may contain more information than the covariance matrix can capture. Many new methods have therefore been proposed that exploit the non-Gaussian structure of data to infer the causal structure of variables. This paper introduces prevention scientists to such causal structure learning methods, particularly those based on the linear, non-Gaussian, acyclic model known as LiNGAM. Under their assumptions, these non-Gaussian data analysis tools can fully estimate the underlying causal structure of variables, even in the presence of unobserved common causes, in contrast to other approaches. A simulated example is also provided.
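The claim that non-Gaussian data carry directional information beyond the covariance matrix can be illustrated with a small simulation. The sketch below is not the paper's simulated example and not a LiNGAM estimation algorithm. It assumes a two-variable linear model with uniform noise, an arbitrary coefficient of 0.8, and a crude dependence proxy (the correlation between squared values); the function names are hypothetical. In the true direction, the regression residual should look independent of the regressor; in the reverse direction it is uncorrelated with the regressor but still dependent on it. With Gaussian noise, both directions look alike.

```python
import numpy as np


def residual(target, regressor):
    """Residual of the least-squares regression of target on regressor."""
    slope = np.cov(target, regressor)[0, 1] / np.var(regressor, ddof=1)
    return (target - target.mean()) - slope * (regressor - regressor.mean())


def squared_corr(u, v):
    """Correlation between u**2 and v**2.

    A crude proxy for dependence beyond ordinary correlation,
    chosen here only for illustration.
    """
    return np.corrcoef(u**2, v**2)[0, 1]


def direction_scores(x, y):
    """Dependence between regressor and residual for x -> y and y -> x."""
    forward = abs(squared_corr(x - x.mean(), residual(y, x)))
    backward = abs(squared_corr(y - y.mean(), residual(x, y)))
    return forward, backward


rng = np.random.default_rng(0)
n = 5000

# Non-Gaussian case: x causes y, both noise terms are uniform (assumed).
x = rng.uniform(-1, 1, n)
y = 0.8 * x + rng.uniform(-1, 1, n)
forward, backward = direction_scores(x, y)

# Gaussian case with the same structure, for contrast.
xg = rng.normal(size=n)
yg = 0.8 * xg + rng.normal(size=n)
g_forward, g_backward = direction_scores(xg, yg)

print(f"uniform noise:  x->y {forward:.3f}, y->x {backward:.3f}")
print(f"Gaussian noise: x->y {g_forward:.3f}, y->x {g_backward:.3f}")
```

In the uniform case the score for the true direction should be close to zero while the reverse direction is clearly larger; in the Gaussian case both scores should be close to zero, so the direction cannot be told apart this way.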
Footnotes
1
These structural equations simply describe the data-generating processes and may be designed without the concept of causality.
 
2
Conditional independence-based approaches can also handle unobserved common causes, but their results usually contain many causal directed acyclic graphs, e.g., see the FCI algorithm (Spirtes et al. 1993).
 
3
Python code written by Taku Yoshioka is freely available at https://github.com/taku-y/bmlingam
 
Literature
Bach, F.R., & Jordan, M.I. (2002). Kernel independent component analysis. Journal of Machine Learning Research, 3, 1–48.
Billingsley, P. (1986). Probability and measure. New York: Wiley-Interscience.
Bollen, K. (1989). Structural equations with latent variables. New York: Wiley.
Darmois, G. (1953). Analyse générale des liaisons stochastiques. Review of the International Statistical Institute, 21, 2–8.
Demidenko, E. (2004). Mixed models: Theory and applications. New York: Wiley-Interscience.
Gretton, A., Bousquet, O., Smola, A.J., Schölkopf, B. (2005). Measuring statistical dependence with Hilbert-Schmidt norms. In Proceedings of 16th international conference on algorithmic learning theory (ALT2005) (pp. 63–77).
Hoyer, P.O., Shimizu, S., Kerminen, A., Palviainen, M. (2008). Estimation of causal effects using linear non-Gaussian causal models with hidden variables. International Journal of Approximate Reasoning, 49, 362–378.
Hoyer, P.O., Janzing, D., Mooij, J., Peters, J., Schölkopf, B. (2009). Nonlinear causal discovery with additive noise models. Advances in Neural Information Processing Systems, 21, 689–696.
Hyvärinen, A., Karhunen, J., Oja, E. (2001). Independent component analysis. New York: Wiley.
Hyvärinen, A., Zhang, K., Shimizu, S., Hoyer, P. (2010). Estimation of a structural vector autoregression model using non-Gaussianity. Journal of Machine Learning Research, 11, 1709–1731.
Imbens, G.W., & Rubin, D.B. (2015). Causal inference in statistics, social, and biomedical sciences. Cambridge: Cambridge University Press.
Kass, R.E., & Raftery, A.E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773–795.
Kraskov, A., Stögbauer, H., Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69, 066138.
Lacerda, G., Spirtes, P., Ramsey, J., Hoyer, P.O. (2008). Discovering cyclic causal models by independent components analysis. In Proceedings of the 24th conference on uncertainty in artificial intelligence (UAI2008) (pp. 366–374).
Mills-Finnerty, C., Hanson, C., Hanson, S.J. (2014). Brain network response underlying decisions about abstract reinforcers. NeuroImage, 103, 48–54.
Moneta, A., Entner, D., Hoyer, P., Coad, A. (2013). Causal inference by independent component analysis: theory and applications. Oxford Bulletin of Economics and Statistics, 75, 705–730.
Pearl, J. (2000). Causality: models, reasoning, and inference. Cambridge: Cambridge University Press.
Pearl, J., & Verma, T. (1991). A theory of inferred causation. In Allen, J., Fikes, R., Sandewall, E. (Eds.) Proceedings of the 2nd international conference on principles of knowledge representation and reasoning (pp. 441–452). San Mateo: Morgan Kaufmann.
Raitakari, O.T., Juonala, M., Rönnemaa, T., Keltikangas-Järvinen, L., Räsänen, L., Pietikäinen, M., Hutri-Kähönen, N., Taittonen, L., Jokinen, E., Marniemi, J., et al. (2008). Cohort profile: The cardiovascular risk in Young Finns Study. International Journal of Epidemiology, 37, 1220–1226.
Rosenström, T., Jokela, M., Puttonen, S., Hintsanen, M., Pulkki-Råback, L., Viikari, J.S., Raitakari, O.T., Keltikangas-Järvinen, L. (2012). Pairwise measures of causal direction in the epidemiology of sleep problems and depression. PLoS ONE, 7, e50841.
Rubin, D.B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66, 688–701.
Shimizu, S. (2014). LiNGAM: Non-Gaussian methods for estimating causal structures. Behaviormetrika, 41, 65–98.
Shimizu, S., & Bollen, K. (2014). Bayesian estimation of causal direction in acyclic structural equation models with individual-specific confounder variables and non-Gaussian distributions. Journal of Machine Learning Research, 15, 2629–2652.
Shimizu, S., Hoyer, P.O., Hyvärinen, A., Kerminen, A. (2006). A linear non-Gaussian acyclic model for causal discovery. Journal of Machine Learning Research, 7, 2003–2030.
Shimizu, S., Inazumi, T., Sogawa, Y., Hyvärinen, A., Kawahara, Y., Washio, T., Hoyer, P.O., Bollen, K. (2011). DirectLiNGAM: A direct method for learning a linear non-Gaussian structural equation model. Journal of Machine Learning Research, 12, 1225–1248.
Skitovitch, W.P. (1953). On a property of the normal distribution. Doklady Akademii Nauk SSSR, 89, 217–219.
Spirtes, P., Glymour, C., Scheines, R. (1993). Causation, prediction, and search. Berlin: Springer. (2nd edn. MIT Press 2000).
Zhang, K., & Chan, L. (2008). Minimal nonlinear distortion principle for nonlinear independent component analysis. Journal of Machine Learning Research, 9, 2455–2487.
Zhang, K., & Hyvärinen, A. (2009). On the identifiability of the post-nonlinear causal model. In Proceedings of the 25th conference on uncertainty in artificial intelligence (UAI2009) (pp. 647–655).
Zhang, K., & Hyvärinen, A. (2016). Nonlinear functional causal models for distinguishing causes from effect. In Wiedermann, W., & von Eye, A. (Eds.) Statistics and causality: methods for applied empirical research. Wiley.
Metadata
Title: Non-Gaussian Methods for Causal Structure Learning
Author: Shohei Shimizu
Publication date: 01-04-2019
Publisher: Springer US
Published in: Prevention Science, Issue 3/2019
Print ISSN: 1389-4986
Electronic ISSN: 1573-6695
DOI: https://doi.org/10.1007/s11121-018-0901-x
