Article

Anonymizing sequential releases

Authors:
Ke Wang, Simon Fraser University, Burnaby, BC, Canada
Benjamin C. M. Fung, Simon Fraser University, Burnaby, BC, Canada

KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, August 2006, Pages 414–423
https://doi.org/10.1145/1150402.1150449

Published: 20 August 2006


ABSTRACT

An organization makes a new release as new information becomes available, releases a tailored view for each data request, or releases sensitive information and identifying information separately. The availability of related releases sharpens the identification of individuals by a global quasi-identifier consisting of attributes from related releases. Since it is not an option to anonymize previously released data, the current release must be anonymized to ensure that a global quasi-identifier is not effective for identification. In this paper, we study the sequential anonymization problem under this assumption. A key question is how to anonymize the current release so that it cannot be linked to previous releases yet remains useful for its own release purpose. We introduce the lossy join, a negative property in relational database design, as a way to hide the join relationship among releases, and propose a scalable and practical solution.
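The lossy-join idea can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the table contents, the two-level job taxonomy `PARENT`, and the helper names `matches`, `join`, `generalize`, and `min_candidates` are all hypothetical, chosen only to show how generalizing the join attribute in the current release introduces spurious tuples when an attacker joins it with a previous release.

```python
from collections import defaultdict

# Hypothetical toy data; none of these values come from the paper.
ORIGINAL = [
    ("Alice", "Lawyer", "Flu"),
    ("Bob", "Banker", "Cancer"),
    ("Carol", "Clerk", "HIV"),
    ("Dave", "Janitor", "Asthma"),
]

# Assumed two-level taxonomy used to generalize the join attribute (job).
PARENT = {
    "Lawyer": "Professional",
    "Banker": "Professional",
    "Clerk": "Staff",
    "Janitor": "Staff",
}

# Previous release (cannot be changed any more): job and disease.
T1 = [(job, disease) for _, job, disease in ORIGINAL]
# Current release before anonymization: name and job.
T2 = [(name, job) for name, job, _ in ORIGINAL]


def matches(a, b):
    """Two job values match if they are equal or one generalizes the other."""
    return a == b or PARENT.get(a) == b or PARENT.get(b) == a


def join(t1, t2):
    """Attacker's join of the two releases on the job attribute."""
    return [
        (name, disease)
        for name, job2 in t2
        for job1, disease in t1
        if matches(job1, job2)
    ]


def generalize(t2):
    """Replace each job in the current release by its parent in the taxonomy."""
    return [(name, PARENT[job]) for name, job in t2]


def min_candidates(joined):
    """Smallest number of distinct diseases linked to any single name."""
    diseases = defaultdict(set)
    for name, disease in joined:
        diseases[name].add(disease)
    return min(len(d) for d in diseases.values())


exact = join(T1, T2)              # join on raw jobs recovers the original pairs
lossy = join(T1, generalize(T2))  # generalized jobs add spurious pairs

print(len(exact), min_candidates(exact))
print(len(lossy), min_candidates(lossy))
```

With this toy data the join on raw jobs returns exactly the four original name and disease pairs, so every name is linked to a single disease. After generalizing the job attribute in the current release, the join returns eight pairs and each name is linked to two candidate diseases. A real solution would also have to weigh how much each generalization reduces the usefulness of the current release, which this sketch does not attempt.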



Published in
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2006
986 pages
ISBN: 1595933395
DOI: 10.1145/1150402
Conference Chair: Tina Eliassi-Rad (LLNL)
General Chair: Lyle Ungar (University of Pennsylvania)
Program Chairs: Mark Craven (University of Wisconsin), Dimitrios Gunopulos (University of California, Riverside)
Copyright © 2006 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
classification
generalization
k-anonymity
privacy
sequential release
