Published in: AIDS and Behavior 7/2018

01-07-2018 | Original Paper

An Online Risk Index for the Cross-Sectional Prediction of New HIV Chlamydia, and Gonorrhea Diagnoses Across U.S. Counties and Across Years

Authors: Man-pui Sally Chan, Sophie Lohmann, Alex Morales, Chengxiang Zhai, Lyle Ungar, David R. Holtgrave, Dolores Albarracín

Abstract

The present study evaluated the potential use of Twitter data for providing risk indices of STIs. We developed online risk indices (ORIs) based on tweets to predict new HIV, gonorrhea, and chlamydia diagnoses, across U.S. counties and across 5 years. We analyzed over one hundred million tweets from 2009 to 2013 using open-vocabulary techniques and estimated the ORIs for a particular year by entering tweets from the same year into multiple semantic models (one for each year). The ORIs were moderately to strongly associated with the actual rates (.35 < rs < .68 for 93% of models), both nationwide and when applied to single states (California, Florida, and New York). Later models were slightly better than older ones at predicting gonorrhea and chlamydia, but not at predicting HIV. The proposed technique using free social media data provides signals of community health at a high temporal and spatial resolution.
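As a rough illustration of how an association like the one reported above could be checked, the sketch below computes a Pearson correlation between predicted index values and log-transformed diagnosis rates across counties. The function name `pearson_r`, the toy numbers, and the use of a natural-log transform are assumptions made for illustration; none of them is taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy values invented for illustration only (NOT data from the study).
# One entry per hypothetical county.
ori = [2.1, 3.4, 1.8, 4.0, 2.9]                 # predicted index values
actual_rate = [10.0, 35.0, 8.0, 60.0, 20.0]     # diagnoses per 100,000

# Assumption: rates are log-transformed before being compared to the index.
log_rate = [math.log(rate) for rate in actual_rate]

r = pearson_r(ori, log_rate)
print(f"r = {r:.2f}")
```

In practice one such coefficient would be computed per infection and per year, which is how a range of values across models would arise.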
Footnotes
1
We compared a set of random tweets with and without location/coordinate information (N = 3,000). The two groups were remarkably similar in vocabulary size (with = 10,911; without = 11,148), character count per tweet (with = 90; without = 87), and word count per tweet (with = 15; without = 14).
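A comparison of this kind can be sketched as follows. This is a minimal example that assumes lowercased whitespace tokenization, which may differ from the tokenizer the authors used; the function name `tweet_stats` and the sample tweets are hypothetical.

```python
def tweet_stats(tweets):
    """Vocabulary size and mean character/word counts for a list of tweets.

    Assumption: tokens are lowercased and split on whitespace.
    """
    if not tweets:
        raise ValueError("need at least one tweet")
    vocab = set()
    total_chars = 0
    total_words = 0
    for text in tweets:
        tokens = text.lower().split()
        vocab.update(tokens)
        total_chars += len(text)
        total_words += len(tokens)
    n = len(tweets)
    return {
        "vocab_size": len(vocab),
        "mean_chars": total_chars / n,
        "mean_words": total_words / n,
    }


# Hypothetical sample tweets, invented for illustration only.
with_location = ["Great day at the beach", "Coffee time downtown"]
without_location = ["Great day today", "Need more coffee now"]

print(tweet_stats(with_location))
print(tweet_stats(without_location))
```

Running the same function on both groups gives directly comparable summaries.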
 
2
The Google Maps Geocoding API is not a free web service (see https://developers.google.com/maps/faq for detailed pricing). Therefore, it is necessary to develop a reliable geo-mapping program for mapping millions of tweets.
 
3
About 19% of all tweets could be geo-mapped, which left a large share of tweets excluded from the analyses. We cannot compare model performance for tweets with and without location/coordinate information because all rates of new HIV/STI diagnoses are reported at the county level.
 
4
We present the rank-based residuals of the actual (non-log-transformed) STI rates and the back-transformed ORIs (see Table in Supplementary Information). The overall residuals showed negligible differences for HIV, gonorrhea, and chlamydia, implying that the semantic models showed no strong biases. Altogether, the semantic models using Twitter language provided satisfactory performance in estimating county-level STI risk.
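The footnote does not define how the rank-based residuals were computed. One plausible reading, sketched here purely as an assumption, is to rank counties by actual rate and by back-transformed index, then take the difference in ranks. The function names `ranks` and `rank_residuals`, the natural-log back-transform, and the toy values are all hypothetical.

```python
import math


def ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def rank_residuals(actual_rates, log_ori):
    """Rank of each actual rate minus rank of its back-transformed index.

    Assumption: the index is on a natural-log scale, so exp() undoes it.
    """
    back_transformed = [math.exp(v) for v in log_ori]
    actual_ranks = ranks(actual_rates)
    predicted_ranks = ranks(back_transformed)
    return [a - p for a, p in zip(actual_ranks, predicted_ranks)]


# Toy values invented for illustration only (NOT data from the study).
actual = [10.0, 30.0, 20.0]
log_index = [math.log(12.0), math.log(25.0), math.log(28.0)]
print(rank_residuals(actual, log_index))
```

Under this reading, residuals near zero for most counties would indicate that the index orders counties much as the actual rates do.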
 
Metadata
Title
An Online Risk Index for the Cross-Sectional Prediction of New HIV Chlamydia, and Gonorrhea Diagnoses Across U.S. Counties and Across Years
Authors
Man-pui Sally Chan
Sophie Lohmann
Alex Morales
Chengxiang Zhai
Lyle Ungar
David R. Holtgrave
Dolores Albarracín
Publication date
01-07-2018
Publisher
Springer US
Published in
AIDS and Behavior / Issue 7/2018
Print ISSN: 1090-7165
Electronic ISSN: 1573-3254
DOI
https://doi.org/10.1007/s10461-018-2046-0
