Skip to main content
Log in

A secure distributed framework for achieving k-anonymity

  • Special Issue Paper
  • Published:
The VLDB Journal Aims and scope Submit manuscript

Abstract

k-anonymity provides a measure of privacy protection by preventing re-identification of data to fewer than a group of k data items. While algorithms exist for producing k-anonymous data, the model has been that of a single source wanting to publish data. Due to privacy issues, it is common that data from different sites cannot be shared directly. Therefore, this paper presents a two-party framework along with an application that generates k-anonymous data from two vertically partitioned sources without disclosing data from one site to the other. The framework is privacy preserving in the sense that it satisfies the secure definition commonly defined in the literature of Secure Multiparty Computation.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Agrawal, R., Evfimievski, A., Srikant, R.: Information sharing across private databases. In: Proceedings of ACM SIGMOD international conference on Management of Data. San Diego, California, 9–12 June 2003. http://doi.acm.org/10.1145/872757.872771

  2. Agrawal, R., Srikant, R.: Privacy-preserving data mining. In: Proceedings of the 2000 ACM SIGMOD Conference on Management of Data. pp. 439–450. ACM Dallas, TX, 14–19 May 2000, http://doi.acm.org/10.1145/342009.335438

  3. Bayardo, R.J., Agrawal, R.: Data privacy through optimal k-anonymization. In: Proceedings of the IEEE International Conference on Data Engineering. Tokyo, Japan, 5–8 2005 April

  4. Benaloh, J.C.: Secret sharing homomorphisms: Keeping shares of a secret. In: Advances in Cryptography, CRYPTO86: Proceedings, Odlyzko, A. (ed.) vol. 263. pp. 251–260. Springer-Verlag, Lecture Notes in Computer Science, 1986, http://springerlink.metapress.com/openurl.asp?genre=article&issn=0302-9743&volume=263&spage=251

  5. Blake, C., Merz, C.: UCI repository of machine learning databases. 1998. http://www.ics.uci.edu/~mlearn/ MLRepository.html

  6. Boudot, F., Schoenmakers, B., Traore, J.: “A fair and efficient solution to the socialist millionaires’ problem. Discrete Appl. Math. 111 (1-2), 23–36 2001. http://www.win.tue.nl/ ~berry/papers/dam.pdf

  7. Dobkin, D., Jones, A.K., Lipton, R.J.: Secure databases: Protection against user influence. ACM Trans. Database Syst 4(1), 97–106 1979. http://doi.acm.org/10.1145/320064.320068

  8. Fischlin, M.: A cost-effective pay-per-multiplication comparison method for millionaires. RSA Secu 2001 Cryptographer’s Track 2020(1–2), 457–471, 2001.http://www.mi.informatik. uni-frankfurt.de/research/papers/fischlin.millionaire.2001.pdf

  9. Freedman, M.J., Nissim, K., Pinkas, B.: Efficient private matching and set intersection. In: Eurocrypt 2004. International Association for Cryptologic Research (IACR), Interlaken, Switzerland: 2–6 2004 May

  10. Goethals, B., Laur, S., Lipmaa, H., Mielikainen, T.: On secure scalar product computation for privacy-preserving data mining. In: Park, C., Chee, S., (eds.), The 7th Annual International Conference in Information Security and Cryptology (ICISC 2004), Seoul, Korea, 2–3 2004 December

  11. Goldreich, O., Micali, S., Wigderson, A.: How to play any mental game - a completeness theorem for protocols with honest majority. In: 19th ACM Symposium on the Theory of Computing, 1987, pp. 218–229. http://doi.acm.org/10.1145/28395.28420

  12. Goldreich, O.: The Foundations of Cryptography.Cambridge University Press, 2004, vol. 2, ch. General Cryptographic Protocols. http://www.wisdom.weizmann.ac.il/~oded/ PSBookFrag/prot.ps

  13. Goldwasser S., Micali S. (1984) Probabilistic encryption. J. Comput. Syst. Sci. 28(2): 270–299

    Article  MATH  MathSciNet  Google Scholar 

  14. Goldwasser, S., Micali, S., Rackoff, C.: The knowledge complexity of interactive proof systems. In: Proceedings of the 17th Annual ACM Symposium on Theory of Computing (STOC’85), pp. 291–304 Providence, Rhode Island, USA, 6–8 1985 May

  15. Hundepool, A., Willenborg, L.: μ- and τ-argus: software for statistical disclosure control. In: Third International Seminar on Statistical Confidentiality (1996)

  16. Ioannidis, I., Grama, A.: An efficient protocol for yao’s millionaires’ problem. In: Hawaii International Conference on System Sciences (HICSS-36), pp. 205–210. 2003 Waikoloa Village, Hawaii, 6–9 2003 January

  17. Iyengar, V.S.: Transforming data to satisfy privacy constraints, In: Proceedings of the 2002 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 23–26 Edmonton, Alberta, Canada, 2002

  18. Jiang, W., Clifton, C.: Privacy-preserving distributed k-anonymity. In: Proceedings of the 19th Annual IFIP WG 11.3 Working Conference on Database and Applications Security, Storrs, Connecticut, 7–10 2005 August

  19. LeFevre, K., DeWitt, D., Ramakrishnan, R.: Incognito: Efficient full-domain k-anonymity. In: Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, Baltimore, MD, 13–16 2005 June

  20. Meyerson, A., Williams, R.: On the complexity of optimal k-anonymity. In: Proceedings of the 23rd ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS 2004).: ACM Press, Paris, France 14–16 2004 June

  21. Moore, R.A., Jr., Controlled data-swapping techniques for masking public use microdata sets. U.S. Bureau of the Census, Washington, DC., Statistical Research Division Report Series RR 96-04, 1996. http://www.census.gov/srd/papers/pdf/rr96-4.pdf

  22. Naccache, D., Stern, J.: A new public key cryptosystem based on higher residues. In: Proceedings of the 5th ACM conference on Computer and communications security. pp. 59–66. ACM Press, San Francisco, California, United States (1998)

  23. Naor, M., Pinkas, B.: Oblivious transfer and polynomial evaluation In: Proceedings of the Thirty-first Annual ACM Symposium on Theory of Computing. pp. 245–254. ACM Press, Atlanta, Georgia, United States: (1999)

  24. Okamoto, T., Uchiyama, S.: A new public-key cryptosystem as secure as factoring. In:Advances in Cryptology - Eurocrypt’98, LNCS 1403. pp. 308–318. Springer, Berlin Heidelberg Newyork(1998)

  25. Rabin, M.: How to exchange secrets by oblivious transfer. Aiken Computation Laboratory, Harvard University, Tech. Rep. TR-81, 1981

  26. Samarati P. (2001) Protecting respondent’s privacy in microdata release. IEEE Trans. Knowl. Data. Eng. 13(6): 1010–1027

    Article  Google Scholar 

  27. Schadow, G., Grannis, S.J., McDonald, C.J.: Privacy- preserving distributed queries for a clinical case research network. In: Clifton, C., Estivill-Castro, V.: (eds), IEEE International Conference on Data Mining Workshop on Privacy, Security, and Data Mining, vol. 14. pp. 55–65. Australian Computer Society, Maebashi City, Japan: 9 2002 December, http://crpit.com/Vol14.html

  28. Sweeney, L.: Guaranteeing anonymity when sharing medical data, the datafly system. Proc. J. Am. Med. Inform. Assoc. (1997)

  29. Sweeney, L.: Computational disclosure control: A primer on data privacy protection. Ph.D. dissertation, Massachusetts Institute of Technology, (2001)

  30. Sweeney, L.: Achieving k-anonymity privacy protection using generalization and suppression. Int J Uncertainty, Fuzziness and Knowledge-based Syst 10(5), 571–588, 2002. http://privacy.cs.cmu.edu/dataprivacy/ projects/kanonymity/ kanonymity2.html

    Google Scholar 

  31. Sweeney, L.: k-anonymity: a model for protecting privacy. Int. J. Uncertainty, Fuzziness and Knowledge-based Syst. 10(5), 557–570 2002. http://privacy.cs.cmu. edu/dataprivacy/projects/kanonymity/kanonymity.html

  32. Vaidya, J.: Privacy preserving data mining over vertically partitioned data. Ph.D. dissertation, Purdue University, West Lafayette, Indiana, 2004. http://www.cs. purdue.edu/homes/jsvaidya/thesis.pdf

  33. Vaidya J., Clifton C. (2005) Secure set intersection cardinality with application to association rule mining. J. Comput. Sec. 13(4): 593–622

    Google Scholar 

  34. Wright, R.N., Yang, Z.: Privacy-preserving bayesian network structure computation on distributed heterogeneous data. In: Proceedings of the 2004 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, WA, 22–25 2004 August

  35. Yao, A.C.: Protocols for secure computation. In: Proceedings of the 23rd IEEE Symposium on Foundations of Computer Science. pp. 160–164. IEEE, 1982

  36. Yao, A.C.: How to generate and exchange secrets. In: Proceedings of the 27th IEEE Symposium on Foundations of Computer Science. pp. 162–167. IEEE, 1986

  37. Zhong, S., Yang, Z., Wright, R.N.: Privacy-enhancing k-anonymization of customer data. In: Proceedings of the 24rd ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS 2005). ACM Press, Baltimore, Maryland, 13–16 2005 June

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Wei Jiang.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Jiang, W., Clifton, C. A secure distributed framework for achieving k-anonymity. The VLDB Journal 15, 316–333 (2006). https://doi.org/10.1007/s00778-006-0008-z

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00778-006-0008-z

Keywords

Navigation