Published in: BMC Medical Informatics and Decision Making 1/2014

Open Access 01-12-2014 | Research article

Scalable privacy-preserving data sharing methodology for genome-wide association studies: an application to iDASH healthcare privacy protection challenge

Authors: Fei Yu, Zhanglong Ji

Published in: BMC Medical Informatics and Decision Making | Special Issue 1/2014

Abstract

In response to growing interest in the privacy of genome-wide association study (GWAS) data, the Integrating Data for Analysis, Anonymization and SHaring (iDASH) center organized the iDASH Healthcare Privacy Protection Challenge to investigate how effectively privacy-preserving methodologies can be applied to human genetic data. This paper is based on a submission to that challenge. We apply privacy-preserving methods adapted from Uhler et al. 2013 and Yu et al. 2014 to the challenge's data and analyze the utility of the data after perturbation by these methods. The major contributions of this paper include a new interpretation of the χ² statistic in a GWAS setting and new results about the Hamming distance score, a key component of one of the privacy-preserving methods.
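For readers unfamiliar with the χ² statistic mentioned in the abstract, the following is a minimal sketch of the standard Pearson χ² computation on a case-control genotype table. It shows only the textbook statistic, not the paper's new interpretation or any of its privacy-preserving methods. The function name `chi_square_statistic` and the genotype counts are hypothetical and chosen purely for illustration.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of disease status and genotype.
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical genotype counts for one SNP
# (columns: 0, 1, 2 copies of the minor allele).
cases = [30, 50, 20]
controls = [50, 40, 10]

chi2 = chi_square_statistic([cases, controls])
print(f"chi-square statistic: {chi2:.4f}")
```

A larger value indicates a stronger departure from independence between case-control status and genotype at that SNP. In a GWAS this statistic would be computed once per SNP.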
Literature
1.
Homer Nils, Szelinger Szabolcs, Redman Margot, Duggan David, Tembe Waibhav, Muehling Jill, Pearson John, Stephan Dietrich, Nelson Stanley, Craig David: Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays. PLoS Genetics. 2008, 4 (8): e1000167. doi:10.1371/journal.pgen.1000167.
2.
Dwork Cynthia, McSherry Frank, Nissim Kobbi, Smith Adam: Calibrating noise to sensitivity in private data analysis. Theory of Cryptography. 2006, 1-20.
3.
Uhler Caroline, Slavkovic Aleksandra, Fienberg Stephen: Privacy-preserving data sharing for genome-wide association studies. Journal of Privacy and Confidentiality. 2013, 5 (1): 137-166.
4.
Johnson Aaron, Shmatikov Vitaly: Privacy-preserving data exploration in genome-wide association studies. Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2013, 1079-1087.
5.
Yu Fei, Fienberg Stephen, Slavković Aleksandra, Uhler Caroline: Scalable privacy-preserving data sharing methodology for genome-wide association studies. Journal of Biomedical Informatics. 2014, 50C: 133-141.
6.
Jiang Xiaoqian, Zhao Yongan, Wang Xiaofeng, Malin Bradley, Wang Shuang, Ohno-Machado Lucila, Tang Haixu: A community assessment of privacy preserving techniques on human genome data. BMC. 2014
7.
McSherry Frank, Talwar Kunal: Mechanism Design via Differential Privacy. 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07). 2007, 94-103.
8.
Bhaskar Raghav, Laxman Srivatsan, Smith Adam, Thakurta Abhradeep: Discovering frequent patterns in sensitive data. Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD '10. 2010, New York, New York, USA, ACM Press, 503-
Metadata
Publisher
BioMed Central
DOI
https://doi.org/10.1186/1472-6947-14-S1-S3