01-04-2019

Non-Gaussian Methods for Causal Structure Learning

Author: Shohei Shimizu

Published in: Prevention Science | Issue 3/2019

Abstract

Causal structure learning is one of the most exciting new topics in machine learning and statistics. In many empirical sciences, including prevention science, the causal mechanisms underlying various phenomena need to be studied. Nevertheless, in many cases, classical methods for causal structure learning cannot estimate the causal structure of variables. This is because they explicitly or implicitly assume Gaussianity of the data and typically use only the covariance structure. In many applications, however, the data are non-Gaussian, which means that the data distribution may contain more information than the covariance matrix can capture. Thus, many new methods have recently been proposed that exploit the non-Gaussian structure of data to infer the causal structure of variables. This paper introduces prevention scientists to such causal structure learning methods, particularly those based on the linear, non-Gaussian, acyclic model known as LiNGAM. Under their assumptions, these non-Gaussian data analysis tools can fully estimate the underlying causal structure of variables, even in the presence of unobserved common causes. This feature distinguishes them from other approaches. A simulated example is also provided.
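The claim that non-Gaussian data carry more information than the covariance matrix can be illustrated with a small simulation. The sketch below is not the paper's simulated example and not a LiNGAM estimation algorithm. It is a simplified pairwise check that assumes two variables, a linear relation, uniform (non-Gaussian) noise, and no unobserved common causes. The function names, the coefficient 0.8, and the sample size are hypothetical choices made for illustration. The idea: a least-squares residual is always uncorrelated with its regressor, but it is independent of the regressor only in the true causal direction, and a higher-order statistic (here the correlation between squared values) can expose the difference.

```python
import numpy as np


def squared_residual_correlation(regressor, target):
    """Regress target on regressor by least squares and return
    corr(regressor**2, residual**2), a crude higher-order dependence score."""
    regressor = regressor - regressor.mean()
    target = target - target.mean()
    slope = (regressor @ target) / (regressor @ regressor)
    residual = target - slope * regressor
    return float(np.corrcoef(regressor**2, residual**2)[0, 1])


def infer_direction(x, y):
    """Pick the direction whose residual looks less dependent on its regressor."""
    forward = abs(squared_residual_correlation(x, y))   # candidate x -> y
    backward = abs(squared_residual_correlation(y, x))  # candidate y -> x
    direction = "x -> y" if forward < backward else "y -> x"
    return direction, forward, backward


rng = np.random.default_rng(0)
n = 5000                    # illustrative sample size (assumption)
half_width = np.sqrt(3.0)   # uniform on [-sqrt(3), sqrt(3)] has unit variance

# Non-Gaussian case: x causes y, both disturbances are uniform.
x = rng.uniform(-half_width, half_width, n)
y = 0.8 * x + rng.uniform(-half_width, half_width, n)
direction, fwd, bwd = infer_direction(x, y)
print(f"uniform noise:  {direction}, forward={fwd:.3f}, backward={bwd:.3f}")

# Gaussian case: same model, but both scores are near zero, so the
# covariance-level information cannot separate the two directions.
xg = rng.normal(size=n)
yg = 0.8 * xg + rng.normal(size=n)
_, fwd_g, bwd_g = infer_direction(xg, yg)
print(f"Gaussian noise: forward={fwd_g:.3f}, backward={bwd_g:.3f}")
```

With uniform noise the score is close to zero only in the true direction, while with Gaussian noise it is close to zero in both, which is why covariance-based methods cannot orient the edge. Practical LiNGAM estimators use proper independence measures rather than this single moment-based score.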

These structural equations simply describe the data-generating processes and may be designed without the concept of causality.

Conditional independence-based approaches can also handle unobserved common causes, but their output usually comprises many candidate causal directed acyclic graphs; see, e.g., the FCI algorithm (Spirtes et al. 1993).

Python code written by Taku Yoshioka is freely available at https://github.com/taku-y/bmlingam

Bach, F.R., & Jordan, M.I. (2002). Kernel independent component analysis. Journal of Machine Learning Research, 3, 1–48.

Billingsley, P. (1986). Probability and measure. New York: Wiley-Interscience.

Bollen, K. (1989). Structural equations with latent variables. New York: Wiley.

Darmois, G. (1953). Analyse générale des liaisons stochastiques. Review of the International Statistical Institute, 21, 2–8.

Demidenko, E. (2004). Mixed models: Theory and applications. New York: Wiley-Interscience.

Gretton, A., Bousquet, O., Smola, A.J., Schölkopf, B. (2005). Measuring statistical dependence with Hilbert-Schmidt norms. In Proceedings of the 16th international conference on algorithmic learning theory (ALT2005) (pp. 63–77).

Hoyer, P.O., Shimizu, S., Kerminen, A., Palviainen, M. (2008). Estimation of causal effects using linear non-Gaussian causal models with hidden variables. International Journal of Approximate Reasoning, 49, 362–378.

Hoyer, P.O., Janzing, D., Mooij, J., Peters, J., Schölkopf, B. (2009). Nonlinear causal discovery with additive noise models. Advances in Neural Information Processing Systems, 21, 689–696.

Hyvärinen, A., Karhunen, J., Oja, E. (2001). Independent component analysis. New York: Wiley.

Hyvärinen, A., Zhang, K., Shimizu, S., Hoyer, P. (2010). Estimation of a structural vector autoregression model using non-Gaussianity. Journal of Machine Learning Research, 11, 1709–1731.

Imbens, G.W., & Rubin, D.B. (2015). Causal inference in statistics, social, and biomedical sciences. Cambridge: Cambridge University Press.

Kass, R.E., & Raftery, A.E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773–795.

Kraskov, A., Stögbauer, H., Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69, 066138.

Lacerda, G., Spirtes, P., Ramsey, J., Hoyer, P.O. (2008). Discovering cyclic causal models by independent components analysis. In Proceedings of the 24th conference on uncertainty in artificial intelligence (UAI2008) (pp. 366–374).

Mills-Finnerty, C., Hanson, C., Hanson, S.J. (2014). Brain network response underlying decisions about abstract reinforcers. NeuroImage, 103, 48–54.

Moneta, A., Entner, D., Hoyer, P., Coad, A. (2013). Causal inference by independent component analysis: theory and applications. Oxford Bulletin of Economics and Statistics, 75, 705–730.

Pearl, J. (2000). Causality: models, reasoning, and inference. Cambridge: Cambridge University Press.

Pearl, J., & Verma, T. (1991). A theory of inferred causation. In Allen, J., Fikes, R., Sandewall, E. (Eds.) Proceedings of the 2nd international conference on principles of knowledge representation and reasoning (pp. 441–452). San Mateo: Morgan Kaufmann.

Raitakari, O.T., Juonala, M., Rönnemaa, T., Keltikangas-Järvinen, L., Räsänen, L., Pietikäinen, M., Hutri-Kähönen, N., Taittonen, L., Jokinen, E., Marniemi, J., et al. (2008). Cohort profile: The Cardiovascular Risk in Young Finns Study. International Journal of Epidemiology, 37, 1220–1226.

Rosenström, T., Jokela, M., Puttonen, S., Hintsanen, M., Pulkki-Råback, L., Viikari, J.S., Raitakari, O.T., Keltikangas-Järvinen, L. (2012). Pairwise measures of causal direction in the epidemiology of sleep problems and depression. PLoS ONE, 7, e50841.

Rubin, D.B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66, 688–701.

Shimizu, S. (2014). LiNGAM: Non-Gaussian methods for estimating causal structures. Behaviormetrika, 41, 65–98.

Shimizu, S., & Bollen, K. (2014). Bayesian estimation of causal direction in acyclic structural equation models with individual-specific confounder variables and non-Gaussian distributions. Journal of Machine Learning Research, 15, 2629–2652.

Shimizu, S., Hoyer, P.O., Hyvärinen, A., Kerminen, A. (2006). A linear non-Gaussian acyclic model for causal discovery. Journal of Machine Learning Research, 7, 2003–2030.

Shimizu, S., Inazumi, T., Sogawa, Y., Hyvärinen, A., Kawahara, Y., Washio, T., Hoyer, P.O., Bollen, K. (2011). DirectLiNGAM: A direct method for learning a linear non-Gaussian structural equation model. Journal of Machine Learning Research, 12, 1225–1248.

Skitovitch, W.P. (1953). On a property of the normal distribution. Doklady Akademii Nauk SSSR, 89, 217–219.

Spirtes, P., & Zhang, K. (2016). Causal discovery and inference: concepts and recent methodological advances. Applied Informatics, 3. https://doi.org/10.1186/s40535-016-0018-x.

Spirtes, P., Glymour, C., Scheines, R. (1993). Causation, prediction, and search. Berlin: Springer. (2nd edn. MIT Press 2000).

Zhang, K., & Chan, L. (2008). Minimal nonlinear distortion principle for nonlinear independent component analysis. Journal of Machine Learning Research, 9, 2455–2487.

Zhang, K., & Hyvärinen, A. (2009). On the identifiability of the post-nonlinear causal model. In Proceedings of the 25th conference on uncertainty in artificial intelligence (UAI2009) (pp. 647–655).

Zhang, K., & Hyvärinen, A. (2016). Nonlinear functional causal models for distinguishing cause from effect. In Wiedermann, W., & von Eye, A. (Eds.) Statistics and causality: methods for applied empirical research. Wiley.

Title: Non-Gaussian Methods for Causal Structure Learning
Author: Shohei Shimizu
Publication date: 01-04-2019
Publisher: Springer US
Published in: Prevention Science / Issue 3/2019
Print ISSN: 1389-4986
Electronic ISSN: 1573-6695
DOI: https://doi.org/10.1007/s11121-018-0901-x
