Published in: BMC Medical Research Methodology 1/2014

Open Access 01-12-2014 | Research article

A comparison of three clustering methods for finding subgroups in MRI, SMS or clinical data: SPSS TwoStep Cluster analysis, Latent Gold and SNOB

Authors: Peter Kent, Rikke K Jensen, Alice Kongsted

Abstract

Background

One methodological approach to identifying clinically important subgroups is to find clusters of characteristics that differentiate people in cross-sectional and/or longitudinal data using Cluster Analysis (CA) or Latent Class Analysis (LCA). Few head-to-head comparisons are available to inform the choice of clustering method for particular clinical datasets and research questions. Therefore, the aim of this study was to perform a head-to-head comparison of three commonly available methods (SPSS TwoStep CA, Latent Gold LCA and SNOB LCA).

Methods

The performance of these three methods was compared (i) quantitatively, using the number of subgroups detected, the classification probability of individuals into subgroups and the reproducibility of results, and (ii) qualitatively, using subjective judgments about each program’s ease of use and the interpretability of its presentation of results.
We analysed five real datasets of varying complexity in a secondary analysis of data from other research projects. Three datasets contained only MRI findings (n = 2,060 to 20,810 vertebral disc levels), one dataset contained only pain intensity data collected for 52 weeks by text (SMS) messaging (n = 1,121 people), and the last dataset contained a range of clinical variables measured in low back pain patients (n = 543 people). Four artificial datasets (n = 1,000 each) containing subgroups of varying complexity were also analysed to test the ability of these clustering methods to detect subgroups and correctly classify individuals when subgroup membership was known.
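
As an illustration of how correct classification might be scored when subgroup membership is known, the following minimal Python sketch compares known labels with the labels a clustering program assigns. It is not the authors' procedure: the function name `best_match_accuracy` and the toy labels are hypothetical. Cluster numbers are arbitrary, so the sketch tries every one-to-one mapping of clusters onto known subgroups and keeps the best one. This is practical only for a small number of subgroups.

```python
from itertools import permutations


def best_match_accuracy(true_labels, cluster_labels):
    """Proportion of individuals correctly classified under the best
    one-to-one mapping of cluster labels onto known subgroup labels.

    Hypothetical helper for illustration; assumes no label is None.
    """
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")

    # Unique labels in order of first appearance.
    true_ids = list(dict.fromkeys(true_labels))
    cluster_ids = list(dict.fromkeys(cluster_labels))

    # If the program found more clusters than there are known subgroups,
    # the surplus clusters map to None and count as misclassified.
    surplus = max(0, len(cluster_ids) - len(true_ids))
    targets = true_ids + [None] * surplus

    best_hits = 0
    for assignment in permutations(targets, len(cluster_ids)):
        mapping = dict(zip(cluster_ids, assignment))
        hits = sum(
            1 for t, c in zip(true_labels, cluster_labels) if mapping[c] == t
        )
        best_hits = max(best_hits, hits)
    return best_hits / len(true_labels)


# Toy example: ten individuals in three known subgroups, one misclassified.
known = ["A"] * 4 + ["B"] * 4 + ["C"] * 2
found = [2, 2, 2, 2, 0, 0, 0, 1, 1, 1]
accuracy = best_match_accuracy(known, found)
print(f"Correctly classified: {accuracy:.0%}")
```

In this toy example nine of the ten individuals fall in the cluster that best matches their known subgroup.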

Results

The results from the real clinical datasets indicated that the number of subgroups detected varied, the certainty of classifying individuals into those subgroups varied, the findings had perfect reproducibility, some programs were easier to use and the interpretability of the presentation of their findings also varied. The results from the artificial datasets indicated that all three clustering methods showed a near-perfect ability to detect known subgroups and correctly classify individuals into those subgroups.

Conclusions

Our subjective judgement was that Latent Gold offered the best balance of sensitivity to subgroups, ease of use and presentation of results with these datasets but we recognise that different clustering methods may suit other types of data and clinical research questions.
Metadata
Title
A comparison of three clustering methods for finding subgroups in MRI, SMS or clinical data: SPSS TwoStep Cluster analysis, Latent Gold and SNOB
Authors
Peter Kent
Rikke K Jensen
Alice Kongsted
Publication date
01-12-2014
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2014
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/1471-2288-14-113
