01-08-2020

Finite Mixture Models with Student t Distributions: an Applied Example

Author: Albert J. Burgess-Hull

Published in: Prevention Science | Issue 6/2020

Abstract

The use of finite mixture modeling (FMM) to identify unobservable or latent groupings of individuals within a population has increased rapidly in applied prevention research. However, many prevention scientists are still unaware of the statistical assumptions underlying FMM. In particular, finite mixture models (FMMs) typically assume that the observed indicator variables are normally distributed within each latent subgroup (i.e., within-class normality). This assumption is rarely met in applied psychological and prevention research, and violating it when fitting an FMM can lead to the identification of spurious subgroups and/or biased parameter estimates. Although new methods have been developed that relax the within-class normality assumption, prevention scientists continue to rely on FMM methods that assume within-class normality. The purpose of the current article is to introduce prevention researchers to an FMM method for heavy-tailed data: FMM with Student t distributions. We begin by reviewing the distributional assumptions that underlie FMM and the limitations of FMM with normal distributions. Next, we introduce FMM with Student t distributions and show, step by step, the analytic and substantive results of fitting an FMM with normal and Student t distributions to data from a smoking-cessation trial. Finally, we extend the results of the applied example to draw conclusions about the use of FMM with Student t distributions in applied settings and to provide guidelines for researchers who wish to use these methods in their own research.

These data are part of a larger simulation study conducted to highlight the potential dangers of fitting an FMM with normal distributions (FMM-n) to non-normally distributed datasets. The results of this study are available in the online supplemental material.

Soft-randomization scheme: the ECM algorithm is initialized with values between 0 and 1. Hard-randomization scheme: the ECM algorithm is initialized with values of either 0 or 1.
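The two randomization schemes can be illustrated with a minimal Python sketch. It assumes the initialization values are entries of an n-by-G matrix of starting group memberships, and that soft values are rescaled so each row sums to 1; both points are assumptions for illustration, and the function names `soft_init` and `hard_init` are hypothetical rather than part of any package.

```python
import random


def soft_init(n, g, rng):
    """Soft start: each of the n rows holds g values in (0, 1].

    Rescaling each row to sum to 1 is an assumption made here so the
    row can be read as a set of starting membership probabilities.
    """
    z = []
    for _ in range(n):
        # 1 - random() lies in (0, 1], so the row total is never zero
        row = [1.0 - rng.random() for _ in range(g)]
        total = sum(row)
        z.append([v / total for v in row])
    return z


def hard_init(n, g, rng):
    """Hard start: each row has a single 1 (random group) and 0 elsewhere."""
    z = []
    for _ in range(n):
        k = rng.randrange(g)
        z.append([1.0 if j == k else 0.0 for j in range(g)])
    return z


rng = random.Random(2020)           # fixed seed so the start is reproducible
soft_start = soft_init(5, 3, rng)   # 5 individuals, 3 candidate subgroups
hard_start = hard_init(5, 3, rng)
```

Because both schemes are random, running the algorithm from several such starts and keeping the best solution is a common way to reduce the risk of a poor local optimum.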

The k-means initialization procedure first runs a k-means cluster analysis and then uses the resulting parameter estimates as starting values in the ECM algorithm.
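As a rough illustration of this idea, the sketch below runs a plain Lloyd-style k-means in Python and turns the result into starting values (group proportions and group means). The helper names `kmeans` and `starting_values`, the toy data, and the choice of which quantities to pass on are assumptions for illustration; the software used in the article may derive its starting values differently.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, g, rng, n_iter=50):
    """Basic Lloyd's algorithm; returns (labels, centers)."""
    centers = [list(p) for p in rng.sample(points, g)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # assignment step: each point goes to its nearest center
        labels = [min(range(g), key=lambda k: sq_dist(p, centers[k]))
                  for p in points]
        # update step: each center becomes the mean of its members
        new_centers = []
        for k in range(g):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append([sum(c) / len(members)
                                    for c in zip(*members)])
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers


def starting_values(labels, centers):
    """Turn a k-means solution into mixture starting values."""
    n = len(labels)
    proportions = [labels.count(k) / n for k in range(len(centers))]
    return {"proportions": proportions, "means": centers}


# Toy data (hypothetical): two well-separated groups of three points
data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = kmeans(data, 2, random.Random(0))
start = starting_values(labels, centers)
```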

A model’s entropy is an aggregate measure of a model’s classification uncertainty and is derived from each individual’s posterior probability of membership in a particular subgroup. Entropy scores range from 0.00 to 1.00, with higher values (> 0.80) indicating that there is adequate separation between the identified subgroups (Asparouhov and Muthén 2018). R code to derive entropy scores from a fitted FMM is available in the online supplemental material.
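One common way to compute such a score is the normalized (relative) entropy, E = 1 − [Σᵢ Σₖ −pᵢₖ ln pᵢₖ] / (n ln K), where pᵢₖ is individual i's posterior probability of belonging to subgroup k. The Python sketch below implements that formula; it is an illustration under the assumption that this is the definition intended here, not a copy of the R code in the supplemental material, and the function name `relative_entropy` is hypothetical.

```python
import math


def relative_entropy(post):
    """Normalized entropy of an n-by-K matrix of posterior probabilities.

    Returns 1.0 when every individual is classified with certainty and
    0.0 when every individual is equally likely to be in any subgroup.
    """
    n = len(post)
    k = len(post[0])
    if k < 2:
        raise ValueError("entropy needs at least two subgroups")
    total = 0.0
    for row in post:
        for p in row:
            if p > 0:  # treat 0 * log(0) as 0
                total -= p * math.log(p)
    return 1.0 - total / (n * math.log(k))


# Hypothetical posterior probabilities for four individuals, two subgroups
posteriors = [[0.98, 0.02], [0.95, 0.05], [0.03, 0.97], [0.10, 0.90]]
score = relative_entropy(posteriors)
```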

To assign each individual to a unique subgroup, we took a “classify-analyze” approach in which participants were assigned to the subgroup for which they had the highest posterior probability. This approach was deemed appropriate because the majority of the identified models had an entropy ≥ 0.90 (Clark and Muthén 2009).
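The assignment step of a classify-analyze approach amounts to taking, for each individual, the subgroup with the largest posterior probability. A minimal Python sketch follows; the function name `modal_assignment` and the example probabilities are hypothetical.

```python
def modal_assignment(post):
    """Return, for each row of posterior probabilities, the index of the
    subgroup with the highest probability (ties go to the lowest index)."""
    return [max(range(len(row)), key=row.__getitem__) for row in post]


# Hypothetical posterior probabilities for three individuals, two subgroups
posteriors = [[0.90, 0.10], [0.20, 0.80], [0.05, 0.95]]
assigned = modal_assignment(posteriors)  # subgroup index per individual
```

The assigned labels can then be used as an ordinary grouping variable in follow-up analyses. Modal assignment discards classification uncertainty, which is why the approach is usually reserved for solutions with high entropy.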

Andrews, J. L., & McNicholas, P. D. (2012). Model-based clustering, classification, and discriminant analysis via mixtures of multivariate t-distributions. Statistics and Computing, 22, 1021–1029. https://doi.org/10.1007/s11222-011-9272-x.

Andrews, J. L., Wickins, J. R., Boers, N. M., & McNicholas, P. D. (2018). teigen: An R package for model-based clustering and classification via the multivariate t distribution. Journal of Statistical Software, 83, 1–32. https://doi.org/10.18637/jss.v083.i07.

Andrews, J. L., McNicholas, P. D., & Subedi, S. (2011). Model-based classification via mixtures of multivariate t-distributions. Computational Statistics & Data Analysis, 55, 520–529.

Asparouhov, T., & Muthén, B. (2016). Structural equation models and mixture models with continuous nonnormal skewed distributions. Structural Equation Modeling: A Multidisciplinary Journal, 23, 1–19.

Asparouhov, T., & Muthén, B. (2018). Variable-specific entropy contribution. Retrieved from http://www.statmodel.com/download/UnivariateEntropy.pdf.

Bauer, D. J. (2007). Observations on the use of growth mixture models in psychological research. Multivariate Behavioral Research, 42, 757–786.

Bauer, D. J., & Curran, P. J. (2003). Distributional assumptions of growth mixture models: Implications for overextraction of latent trajectory classes. Psychological Methods, 8, 338–363. https://doi.org/10.1037/1082-989X.8.3.338.

Bauer, D. J., & Curran, P. J. (2004). The integration of continuous and discrete latent variable models: Potential problems and promising opportunities. Psychological Methods, 9, 3–29. https://doi.org/10.1037/1082-989X.9.1.3.

Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological), 57, 289–300.

Blanca, M. J., Arnau, J., López-Montiel, D., Bono, R., & Bendayan, R. (2013). Skewness and kurtosis in real data samples. Methodology, 9, 78–84.

Bonanno, G. A., & Mancini, A. D. (2012). Beyond resilience and PTSD: Mapping the heterogeneity of responses to potential trauma. Psychological Trauma: Theory, Research, Practice, and Policy, 4, 74–83. https://doi.org/10.1037/a0017829.

Bonanno, G. A., Ho, S. M. Y., Chan, J. C. K., Kwong, R. S. Y., Cheung, C. K. Y., Wong, C. P. Y., & Wong, V. C. W. (2008). Psychological resilience and dysfunction among hospitalized survivors of the SARS epidemic in Hong Kong: A latent class approach. Health Psychology, 27, 659–667. https://doi.org/10.1037/0278-6133.27.5.659.

Burgess-Hull, A. J., Roberts, L. J., Piper, M. E., & Baker, T. B. (2018). The social networks of smokers attempting to quit: An empirically derived and validated classification. Psychology of Addictive Behaviors, 32, 64–75. https://doi.org/10.1037/adb0000336.

Clark, S. L., & Muthén, B. (2009). Relating latent class analysis results to variables not included in the analysis. Retrieved from: https://www.statmodel.com/download/relatinglca.pdf

Cudeck, R., & Henly, S. J. (2003). A realistic perspective on pattern representation in growth data: Comment on Bauer and Curran (2003). Psychological Methods, 8, 378–383.

Forster, M. R. (2000). Key concepts in model selection: Performance and generalizability. Journal of Mathematical Psychology, 44, 205–231.

Forster, M. (2004). Simplicity and unification in model selection. Retrieved from http://philosophy.wisc.edu/forster/520/Chapter 3.pdf.

Fraley, C., & Raftery, A. E. (1998). How many clusters? Which clustering method? Answers via model-based cluster analysis. Computer Journal, 41, 586–588.

Gerogiannis, D., Nikou, C., & Likas, A. (2009). The mixtures of Student’s t-distributions as a robust framework for rigid registration. Image and Vision Computing, 27, 1285–1294.

Gibson, W. A. (1959). Three multivariate models: Factor analysis, latent structure analysis and latent profile analysis. Psychometrika, 24, 229–252. https://doi.org/10.1007/BF02289845.

Hennig, C. (2015). What are the true clusters? Pattern Recognition Letters, 64, 53–62.

Jackson, K. M., Sher, K. J., & Wood, P. K. (2000). Trajectories of concurrent substance use disorders: A developmental, typological approach to comorbidity. Alcoholism: Clinical and Experimental Research, 24, 902–913.

Krueger, R. F., Markon, K. E., Patrick, C. J., & Iacono, W. G. (2005). Externalizing psychopathology in adulthood: A dimensional-spectrum conceptualization and its implications for DSM-V. Journal of Abnormal Psychology, 114, 537.

Lange, K. L., Little, R. J., & Taylor, J. M. (1989). Robust statistical modeling using the t distribution. Journal of the American Statistical Association, 84, 881–896.

Lanza, S. T., & Rhoades, B. L. (2013). Latent class analysis: An alternative perspective on subgroup analysis in prevention and treatment. Prevention Science, 14, 157–168.

Lee, S. X., & McLachlan, G. J. (2013). On mixtures of skew normal and skew t-distributions. Advances in Data Analysis and Classification, 7, 241–266.

Lei, H., Nahum-Shani, I., Lynch, K., Oslin, D., & Murphy, S. A. (2012). A “SMART” design for building individualized treatment sequences. Annual Review of Clinical Psychology, 8, 21–48. https://doi.org/10.1146/annurev-clinpsy-032511-143152.

Lo, Y., Mendell, N. R., & Rubin, D. B. (2001). Testing the number of components in a normal mixture. Biometrika, 88, 767–778. https://doi.org/10.1093/biomet/88.3.767.

Mann, H. B., & Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. The Annals of Mathematical Statistics, 18, 50–60.

McLachlan, G. J., & Peel, D. (2000). Finite mixture models. Wiley.

McLachlan, G. J., & Peel, D. (1998). Robust cluster analysis via mixtures of multivariate t-distributions. In A. Amin, D. Dori, P. Pudil, & H. Freeman (Eds.), Advances in pattern recognition. SSPR/SPR 1998 (pp. 658–666). Berlin, Heidelberg: Springer.

McNicholas, P. D., & Subedi, S. (2012). Clustering gene expression time course data using mixtures of multivariate t-distributions. Journal of Statistical Planning and Inference, 142, 1114–1127.

Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. Psychological Bulletin, 105, 156–166. https://doi.org/10.1037/0033-2909.105.1.156.

Muthén, B. (2003). Statistical and substantive checking in growth mixture modeling: Comment on Bauer and Curran (2003). Psychological Methods, 8, 369–377.

Muthén, L. K., & Muthén, B. O. (1998–2017). Mplus user’s guide (8th ed.). Los Angeles, CA: Muthén & Muthén.

Nagin, D. S., & Tremblay, R. E. (2005). Developmental trajectory groups: Fact or a useful statistical fiction? Criminology, 43, 873–904.

Nylund, K. L., Asparouhov, T., & Muthén, B. O. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling: A Multidisciplinary Journal, 14, 535–569.

Peel, D., & McLachlan, G. J. (2000). Robust mixture modelling using the t distribution. Statistics and Computing, 10, 339–348. https://doi.org/10.1023/A:1008981510081.

Piper, M. E., Smith, S. S., Schlam, T. R., Fiore, M. C., Jorenby, D. E., Fraser, D., & Baker, T. B. (2009). A randomized placebo-controlled clinical trial of 5 smoking cessation pharmacotherapies. Archives of General Psychiatry, 66, 1253–1262.

Piper, M. E., Cook, J. W., Schlam, T. R., Jorenby, D. E., Smith, S. S., Bolt, D. M., & Loh, W. Y. (2010). Gender, race, and education differences in abstinence rates among participants in two randomized smoking cessation trials. Nicotine & Tobacco Research, 12, 647–657.

Posada, D., & Buckley, T. R. (2004). Model selection and model averaging in phylogenetics: Advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Systematic Biology, 53, 793–808.

R Core Team. (2019). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Rocke, D. M., & Woodruff, D. L. (1997). Robust estimation of multivariate location and shape. Journal of Statistical Planning and Inference, 57, 245–255.

Sampson, R. J., & Laub, J. H. (2005). Seductions of method: Rejoinder to Nagin and Tremblay's “Developmental trajectory groups: Fact or fiction?”. Criminology, 43, 905–913.

Tofighi, D., & Enders, C. K. (2008). Identifying the correct number of classes in growth mixture models. In Advances in latent variable mixture models (pp. 317–341). Information Age Publishing.

Van Horn, M. L., Smith, J., Fagan, A. A., Jaki, T., Feaster, D. J., Masyn, K., et al. (2012). Not quite normal: Consequences of violating the assumption of normality in regression mixture models. Structural Equation Modeling: A Multidisciplinary Journal, 19, 227–249.

Vermunt, J., & Magidson, J. (2002). Latent class cluster analysis. In J. Hagenaars & A. McCutcheon (Eds.), Applied latent class analysis (pp. 89–106).

Vrbik, I., & McNicholas, P. D. (2014). Parsimonious skew mixture models for model-based clustering and classification. Computational Statistics & Data Analysis, 71, 196–210.

Vuong, Q. H. (1989). Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica: Journal of the Econometric Society, 57, 307–333.

Title: Finite Mixture Models with Student t Distributions: an Applied Example
Author: Albert J. Burgess-Hull
Publication date: 01-08-2020
Publisher: Springer US
Published in: Prevention Science / Issue 6/2020
Print ISSN: 1389-4986
Electronic ISSN: 1573-6695
DOI: https://doi.org/10.1007/s11121-020-01109-3
