Abstract
The publication of transaction data, such as market basket data, medical records, and query logs, serves the public benefit. Mining such data allows the derivation of association rules that connect certain items to others with measurable confidence. Still, this type of data analysis poses a privacy threat: an adversary with partial information on a person's behavior may confidently associate that person with an item deemed sensitive. Ideally, an anonymization of such data should produce an inference-proof version that prevents the association of individuals with sensitive items, while otherwise allowing truthful associations to be derived. Early approaches to this problem were based on value perturbation, which damages data integrity. More recently, value generalization has been proposed as an alternative; still, approaches based on it have assumed either that all items are equally sensitive, or that some are sensitive and can be known to an adversary only by association, while others are non-sensitive and can be known directly. In reality, there is a distinction between sensitive and non-sensitive items, yet an adversary may possess information on any of them. Most critically, no previous method aims at a clear inference-proof privacy guarantee. In this paper, we propose ρ-uncertainty, to our knowledge the first privacy concept that inherently safeguards against sensitive associations without constraining the nature of an adversary's knowledge and without falsifying data. The problem of achieving ρ-uncertainty with low information loss is challenging because it is natural. A trivial solution is to suppress all sensitive items. We develop more sophisticated schemes. In a broad experimental study, we show that the problem is solved non-trivially by a technique that combines generalization and suppression, which also achieves favorable results compared to a baseline perturbation-based scheme.
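To make the notion of rule confidence concrete, the following is a minimal Python sketch on toy data. It is not the paper's algorithm. The abstract does not state the formal definition of ρ-uncertainty, so the sketch assumes a reading in which a dataset is acceptable when no rule from any subset of a transaction (including the empty set) to a sensitive item outside that subset has confidence above ρ. The function names `confidence` and `violating_rules`, the item names, and the threshold value are all hypothetical.

```python
from itertools import combinations


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent.

    Computed as support(antecedent plus consequent) / support(antecedent).
    Returns 0.0 when no transaction contains the antecedent.
    """
    antecedent = frozenset(antecedent)
    covering = [t for t in transactions if antecedent <= t]
    if not covering:
        return 0.0
    return sum(1 for t in covering if consequent in t) / len(covering)


def violating_rules(transactions, sensitive, rho):
    """List rules (antecedent, sensitive_item, confidence) with confidence > rho.

    Assumption: the antecedent may contain any items, sensitive or not,
    since the adversary's knowledge is unconstrained. Brute force over
    subsets of each transaction, so exponential: toy data only.
    """
    transactions = [frozenset(t) for t in transactions]
    seen = set()
    violations = []
    for t in transactions:
        for alpha in sorted(t & sensitive):
            rest = sorted(t - {alpha})
            for k in range(len(rest) + 1):
                for chi in combinations(rest, k):
                    if (chi, alpha) in seen:
                        continue
                    seen.add((chi, alpha))
                    c = confidence(transactions, chi, alpha)
                    if c > rho:
                        violations.append((chi, alpha, c))
    return violations


# Hypothetical toy dataset; "hiv_test" plays the role of a sensitive item.
toy = [
    {"bread", "milk", "hiv_test"},
    {"bread", "milk"},
    {"bread", "beer"},
    {"milk", "beer", "hiv_test"},
]
sensitive_items = {"hiv_test"}

found = violating_rules(toy, sensitive_items, rho=0.6)

# The trivial solution mentioned in the abstract: suppress all sensitive items.
suppressed = [t - sensitive_items for t in toy]
found_after_suppression = violating_rules(suppressed, sensitive_items, rho=0.6)
```

On this toy data, `found` holds two rules: `("milk",) -> "hiv_test"` with confidence 2/3 and `("beer", "milk") -> "hiv_test"` with confidence 1. After suppressing every sensitive item, `found_after_suppression` is empty, which illustrates why suppression trivially satisfies the guarantee at the cost of losing all information about sensitive items.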