Published in: BMC Medical Informatics and Decision Making 1/2014

Open Access 01-12-2014 | Research article

A community assessment of privacy preserving techniques for human genomes

Authors: Xiaoqian Jiang, Yongan Zhao, Xiaofeng Wang, Bradley Malin, Shuang Wang, Lucila Ohno-Machado, Haixu Tang

Abstract

To answer the need for the rigorous protection of biomedical data, we organized the Critical Assessment of Data Privacy and Protection initiative as a community effort to evaluate privacy-preserving dissemination techniques for biomedical data. We focused on the challenge of sharing aggregate human genomic data (e.g., allele frequencies) in a way that preserves the privacy of the data donors, without undermining the utility of genome-wide association studies (GWAS) or impeding their dissemination. Specifically, we designed two problems for disseminating the raw data and the analysis outcome, respectively, based on publicly available data from HapMap and from the Personal Genome Project. A total of six teams participated in the challenges. The final results were presented at a workshop of the iDASH (integrating Data for Analysis, 'anonymization,' and SHaring) National Center for Biomedical Computing. We report the results of the challenge and our findings about the current genome privacy protection techniques.
Metadata
Title
A community assessment of privacy preserving techniques for human genomes
Authors
Xiaoqian Jiang
Yongan Zhao
Xiaofeng Wang
Bradley Malin
Shuang Wang
Lucila Ohno-Machado
Haixu Tang
Publication date
01-12-2014
Publisher
BioMed Central
DOI
https://doi.org/10.1186/1472-6947-14-S1-S1