research-article

ρ-uncertainty: inference-proof transaction anonymization

Published: 01 September 2010

Abstract

The publication of transaction data, such as market basket data, medical records, and query logs, serves the public benefit. Mining such data allows for the derivation of association rules that connect certain items to others with measurable confidence. Still, this type of data analysis poses a privacy threat: an adversary with partial information on a person's behavior may confidently associate that person with an item deemed sensitive. Ideally, an anonymization of such data should lead to an inference-proof version that prevents the association of individuals with sensitive items, while otherwise allowing truthful associations to be derived. Early approaches to this problem were based on value perturbation, which damages data integrity. Recently, value generalization has been proposed as an alternative; still, approaches based on it have assumed either that all items are equally sensitive, or that some are sensitive and can be known to an adversary only by association, while others are non-sensitive and can be known directly. In reality, items do divide into sensitive and non-sensitive ones, yet an adversary may possess information on any of them. Most critically, no prior method aims at a clear inference-proof privacy guarantee. In this paper, we propose ρ-uncertainty, the first, to our knowledge, privacy concept that inherently safeguards against sensitive associations without constraining the nature of an adversary's knowledge and without falsifying data. The problem of achieving ρ-uncertainty with low information loss is challenging because it is natural. A trivial solution is to suppress all sensitive items; we develop more sophisticated schemes. In a broad experimental study, we show that the problem is solved non-trivially by a technique that combines generalization and suppression, which also achieves favorable results compared to a baseline perturbation-based scheme.
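
To make the notion of a confident sensitive association concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes that ρ-uncertainty can be read as "no rule whose consequent is a sensitive item reaches confidence ρ"; the abstract does not state the formal definition, so this reading, the function names `confidence` and `violating_rules`, the bound `max_antecedent_size`, and the toy data are all illustrative assumptions. Antecedents may contain any other item, sensitive or not, reflecting the abstract's point that an adversary may know any item.

```python
from itertools import combinations


def confidence(transactions, antecedent, consequent_item):
    """Fraction of transactions containing `antecedent` that also contain `consequent_item`."""
    antecedent = frozenset(antecedent)
    support_a = sum(1 for t in transactions if antecedent <= t)
    if support_a == 0:
        return 0.0
    support_both = sum(
        1 for t in transactions if antecedent <= t and consequent_item in t
    )
    return support_both / support_a


def violating_rules(transactions, sensitive, rho, max_antecedent_size=2):
    """Brute-force search for rules (antecedent -> sensitive item) with confidence >= rho.

    Assumption: only non-empty antecedents of up to `max_antecedent_size`
    items are checked, which keeps this toy example small.
    """
    transactions = [frozenset(t) for t in transactions]
    sensitive = frozenset(sensitive)
    violations = set()
    for t in transactions:
        for s in t & sensitive:
            others = sorted(t - {s})  # anything else the adversary might know
            for size in range(1, max_antecedent_size + 1):
                for antecedent in combinations(others, size):
                    c = confidence(transactions, antecedent, s)
                    if c >= rho:
                        violations.add((antecedent, s, c))
    return sorted(violations)


# Hypothetical toy data: four market baskets, one sensitive item.
transactions = [
    {"bread", "milk", "pregnancy_test"},
    {"bread", "milk"},
    {"bread", "beer"},
    {"milk", "pregnancy_test"},
]
sensitive = {"pregnancy_test"}

found = violating_rules(transactions, sensitive, rho=0.6)

# The trivial solution named in the abstract: suppress every sensitive item.
suppressed = [set(t) - sensitive for t in transactions]
found_after_suppression = violating_rules(suppressed, sensitive, rho=0.6)
```

On this toy data the only rule at or above ρ = 0.6 is milk → pregnancy_test, since two of the three baskets containing milk also contain the sensitive item. Suppressing all sensitive items removes every such rule, but it also discards every truthful association involving those items, which is the information loss that more refined schemes aim to reduce.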


Published in

Proceedings of the VLDB Endowment, Volume 3, Issue 1-2
September 2010
1658 pages

Publisher: VLDB Endowment
