Published in: BMC Medical Research Methodology 1/2011

Open Access 01-12-2011 | Research article

Application of latent semantic analysis for open-ended responses in a large, epidemiologic study

Authors: Travis D Leleu, Isabel G Jacobson, Cynthia A LeardMann, Besa Smith, Peter W Foltz, Paul J Amoroso, Marcia A Derr, Margaret AK Ryan, Tyler C Smith, the Millennium Cohort Study Team


Abstract

Background

The Millennium Cohort Study is a longitudinal cohort study designed in the late 1990s to evaluate how military service may affect long-term health. The purpose of this investigation was to examine characteristics of Millennium Cohort Study participants who responded to the open-ended question, and to identify and investigate the most commonly reported areas of concern.

Methods

Participants who responded during the 2001-2003 and 2004-2006 questionnaire cycles were included in this study (n = 108,129). Latent Semantic Analysis (LSA) was applied to responses to a broad open-ended question asking participants whether they had any additional health concerns. Multivariable logistic regression was performed to examine the adjusted odds of responding to the open-text field, and cluster analysis was performed to identify the major areas of concern among participants providing open-ended responses.
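
The abstract does not describe how LSA and the cluster analysis were implemented, so the following is only a minimal sketch of the general technique under stated assumptions: raw term counts, a truncated singular value decomposition computed with NumPy, and a toy two-cluster k-means. The example responses, the function names, and the choice of k = 2 are all hypothetical and are not taken from the study.

```python
import numpy as np

# Hypothetical free-text responses, invented for illustration (not study data).
responses = [
    "knee injury pain after training",
    "back injury chronic pain",
    "old shoulder injury pain",
    "exposure to burning oil smoke",
    "smoke exposure near burning trash",
]

def term_document_matrix(docs):
    """Return (vocab, counts), where counts[i, j] is the count of term i in doc j."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for word in doc.split():
            counts[index[word], j] += 1
    return vocab, counts

def lsa_document_vectors(counts, k):
    """Project documents into a k-dimensional latent space via truncated SVD."""
    _, s, vt = np.linalg.svd(counts, full_matrices=False)
    return (np.diag(s[:k]) @ vt[:k]).T  # one row per document

def two_means(points, iterations=10):
    """Toy k-means with k = 2 and a deterministic start (assumed, not the study's method).

    The first centroid is document 0; the second is the document least similar to it.
    """
    far = int(np.argmin(points @ points[0]))
    centroids = np.stack([points[0], points[far]])
    for _ in range(iterations):
        # Assign each document to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # Recompute each centroid as the mean of its assigned documents.
        centroids = np.stack([points[labels == c].mean(axis=0) for c in range(2)])
    return labels

vocab, counts = term_document_matrix(responses)
doc_vectors = lsa_document_vectors(counts, k=2)

# Normalize to unit length so dot products are cosine similarities.
unit = doc_vectors / np.linalg.norm(doc_vectors, axis=1, keepdims=True)
similarity = unit @ unit.T
labels = two_means(unit)

print(np.round(similarity, 2))
print(labels)
```

In this toy corpus the injury-related and exposure-related responses share no vocabulary, so the two latent dimensions separate them cleanly. Real survey text would normally need stop-word removal, term weighting, and many more latent dimensions before clustering.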

Results

Participants who provided information in the open-ended text field (n = 27,916) had significantly lower self-reported general health than those who did not. The bulk of responses concerned a finite number of topics, most notably illness/injury, exposure, and exercise.

Conclusion

These findings suggest generalized topic areas and identify subgroups that are more likely to provide additional information in their responses, which may add insight into future epidemiologic and military research.
Metadata
Title
Application of latent semantic analysis for open-ended responses in a large, epidemiologic study
Authors
Travis D Leleu
Isabel G Jacobson
Cynthia A LeardMann
Besa Smith
Peter W Foltz
Paul J Amoroso
Marcia A Derr
Margaret AK Ryan
Tyler C Smith
the Millennium Cohort Study Team
Publication date
01-12-2011
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2011
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/1471-2288-11-136
