Hiding the presence of individuals from shared databases

Authors:
Mehmet Ercan Nergiz (Purdue University, West Lafayette, IN)
Maurizio Atzori (ISTI-CNR, Pisa, Italy)
Chris Clifton (Purdue University, West Lafayette, IN)

SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data, June 2007, Pages 665–676. https://doi.org/10.1145/1247480.1247554

Published: 11 June 2007

ABSTRACT

Advances in information technology, and its use in research, are increasing both the need for anonymized data and the risks of poor anonymization. We present a metric, δ-presence, that clearly links the quality of anonymization to the risk posed by inadequate anonymization. We show that existing anonymization techniques are inappropriate for situations where δ-presence is a good metric (specifically, where knowing an individual is in the database poses a privacy risk), and present algorithms for effectively anonymizing to meet δ-presence. The algorithms are evaluated in the context of a real-world scenario, demonstrating practical applicability of the approach.
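
The abstract does not spell out how δ-presence is computed, so the following is only a minimal sketch under stated assumptions: a publicly known table and a private table that is a subset of it, both generalized into the same groups, with the chance that a given individual appears in the private table estimated as the private count divided by the public count for that individual's group. The function names, the `(delta_min, delta_max)` bounds, and the toy data are illustrative assumptions, not the authors' algorithm.

```python
from collections import Counter


def presence_probabilities(public_groups, private_groups):
    """Estimate, per generalized group, the chance that a member of the
    public table also appears in the private table.

    public_groups:  one generalized group label per individual in the
                    public table (hypothetical input format).
    private_groups: one label per individual in the private table,
                    assumed to be a subset of the public table.
    """
    public_counts = Counter(public_groups)
    private_counts = Counter(private_groups)
    return {
        group: private_counts.get(group, 0) / total
        for group, total in public_counts.items()
    }


def satisfies_delta_presence(public_groups, private_groups, delta_min, delta_max):
    """Return True if every group's presence probability lies within
    [delta_min, delta_max] (assumed reading of the metric)."""
    probabilities = presence_probabilities(public_groups, private_groups)
    return all(delta_min <= p <= delta_max for p in probabilities.values())


# Toy data: two generalized groups, "A" and "B", four public members each.
public = ["A"] * 4 + ["B"] * 4
private = ["A", "A", "B"]

probs = presence_probabilities(public, private)
ok_loose = satisfies_delta_presence(public, private, 0.2, 0.5)
ok_tight = satisfies_delta_presence(public, private, 0.3, 0.5)
```

In this toy case an adversary who knows someone falls in group "A" would estimate a one-in-two chance that the person is in the private table, which is the kind of inference the metric is meant to bound.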

Published in
SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data
June 2007
1210 pages
ISBN: 9781595936868
DOI: 10.1145/1247480
General Chairs: Lizhu Zhou (Tsinghua University, China), Tok Wang Ling (National University of Singapore, Singapore)
Program Chair: Beng Chin Ooi (National University of Singapore, Singapore)
Copyright © 2007 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
delta presence
k-anonymity
medical databases
privacy
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 785 of 4,003 submissions, 20%

Article Metrics
- Total Citations: 189
- Total Downloads: 350
- Downloads (Last 12 months): 55
- Downloads (Last 6 weeks): 10