Published in: Prevention Science 4/2017

01-05-2017

A Successful Strategy for Linking Anonymous Data from Students’ and Parents’ Questionnaires Using Self-Generated Identification Codes

Authors: Jaroslav Vacek, Hana Vonkova, Roman Gabrhelík


Abstract

We conducted a feasibility study for matching children (N = 2571, average age 12 years, 50.4% female) and their parents (N = 1931, average age 41 years, 83.3% female) represented by an anonymous self-generated identification code (SGIC) and assessed its methodological properties. We used a nine-character SGIC with the children and a mirrored version of the same code with the parents. The average overall error rate in generating the SGIC was 9.7% (4.0% in the parents and 13.9% in the children). We were able to link a total of 1765 parents’ and children’s codes uniquely (94.9% of all possible dyads) with any four-character combination and the employment of the “school” variable. The overall matching quality of linking using the SGIC only is characterized by precision (positive predictive value) of 0.979, recall (sensitivity, true positive rate) of 0.934, and an F-measure (harmonic mean of precision and recall) of 0.956. The analysis of the discrepant characters in the dyads identified the paternal grandmother’s name and eye color as those varying most often. This study is the first to look at SGIC match rates and error and omission rates in linking different subjects into dyads in prevention research. We identified a high number of unique child-parent matches while guaranteeing anonymity to the participants. We provided evidence that our SGIC is a suitable tool for between-group linking procedures and has a highly successful matching rate, while maintaining anonymity in the school-based prevention study samples.
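
As a rough consistency check on the figures reported above, the following Python sketch recomputes the F-measure from the stated precision and recall, and the overall error rate from the two group rates. The function names are hypothetical, and treating the overall error rate as a sample-size-weighted average of the parent and child rates is an assumption made here for illustration; the abstract does not say how the overall figure was derived.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def pooled_rate(rates_and_sizes):
    """Sample-size-weighted average of group rates (an assumed pooling method)."""
    total = sum(n for _, n in rates_and_sizes)
    return sum(rate * n for rate, n in rates_and_sizes) / total


# Precision and recall as reported in the abstract.
f = f_measure(0.979, 0.934)

# Error rates and sample sizes as reported: parents (4.0%, N = 1931),
# children (13.9%, N = 2571).
overall_error = pooled_rate([(0.040, 1931), (0.139, 2571)])

print(round(f, 3))              # 0.956
print(round(overall_error, 3))  # 0.097
```

Under these assumptions both values agree with the reported F-measure of 0.956 and the overall error rate of 9.7%.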
Metadata
Publisher
Springer US
Published in
Prevention Science / Issue 4/2017
Print ISSN: 1389-4986
Electronic ISSN: 1573-6695
DOI
https://doi.org/10.1007/s11121-017-0772-6
