Computer Science > Cryptography and Security
[Submitted on 12 Feb 2024]
Title:Privacy-Optimized Randomized Response for Sharing Multi-Attribute Data
View PDFAbstract:With the increasing amount of data in society, privacy concerns in data sharing have become widely recognized. Particularly, protecting personal attribute information is essential for a wide range of aims from crowdsourcing to realizing personalized medicine. Although various differentially private methods based on randomized response have been proposed for single attribute information or specific analysis purposes such as frequency estimation, there is a lack of studies on the mechanism for sharing individuals' multiple categorical information itself. The existing randomized response for sharing multi-attribute data uses the Kronecker product to perturb each attribute information in turn according to the respective privacy level but achieves only a weak privacy level for the entire dataset. Therefore, in this study, we propose a privacy-optimized randomized response that guarantees the strongest privacy in sharing multi-attribute data. Furthermore, we present an efficient heuristic algorithm for constructing a near-optimal mechanism. The time complexity of our algorithm is O(k^2), where k is the number of attributes, and it can be performed in about 1 second even for large datasets with k = 1,000. The experimental results demonstrate that both of our methods provide significantly stronger privacy guarantees for the entire dataset than the existing method. In addition, we show an analysis example using genome statistics to confirm that our methods can achieve less than half the output error compared with that of the existing method. Overall, this study is an important step toward trustworthy sharing and analysis of multi-attribute data. The Python implementation of our experiments and supplemental results are available at this https URL.
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.