DOI: 10.1145/3589334.3648157
research-article
Free Access

GraphLeak: Patient Record Leakage through Gradients with Knowledge Graph

Published: 13 May 2024

ABSTRACT

In real clinical settings, medical data are scattered across multiple hospitals. Due to security and privacy concerns, it is almost impossible to gather all the data in one place and train a unified model. Multi-node machine learning systems are therefore currently the mainstream form of model training in healthcare systems. Nevertheless, distributed training relies on the exchange of gradients, which has been shown to carry a risk of privacy leakage: malicious attackers can restore a user's sensitive data from the publicly shared gradients, a serious problem for extremely private data such as Electronic Healthcare Records (EHRs). The performance of previous gradient attack methods drops rapidly as the batch size of the training data increases, which makes them less threatening in practice. In this paper, however, we find that in the medical domain the leakage risk can be significantly amplified by leveraging prior knowledge such as a medical knowledge graph. In particular, we present GraphLeak, which incorporates the medical knowledge graph into gradient leakage attacks. GraphLeak improves the restoration effect of gradient attacks even with large batches of data. We conduct experimental verification on electronic healthcare record datasets, including eICU and MIMIC-III. Our method achieves state-of-the-art attack performance compared with previous works. Code is available at https://github.com/anonymous4ai/GraphLeak.
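
The sketch below is a generic illustration of why shared gradients can leak records and why larger batches blunt a naive attack. It is not the GraphLeak method, and it uses no knowledge graph. It assumes a single linear layer with a bias and softmax cross-entropy loss; the multi-hot "records", the dimensions, and the function names are hypothetical. For one record, row i of the weight gradient equals the bias gradient entry i times the input, so a division recovers the input exactly. For a batch, the gradients are averaged, and the same division yields only a mixture of the records.

```python
import numpy as np

def linear_softmax_grads(W, b, X, y):
    """Mean cross-entropy gradients of logits = X @ W.T + b w.r.t. W and b."""
    logits = X @ W.T + b
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    delta = probs.copy()
    delta[np.arange(len(y)), y] -= 1.0               # dL/dlogits for each sample
    grad_W = delta.T @ X / len(y)                    # averaged over the batch
    grad_b = delta.mean(axis=0)
    return grad_W, grad_b

def recover_input(grad_W, grad_b):
    """Naive recovery: for a single record, grad_W[i] == grad_b[i] * x."""
    i = np.argmax(np.abs(grad_b))                    # pick a well-conditioned row
    return grad_W[i] / grad_b[i]

rng = np.random.default_rng(0)
n_features, n_classes = 8, 3                         # hypothetical sizes
W = rng.normal(size=(n_classes, n_features))
b = rng.normal(size=n_classes)

# Batch size 1: one hypothetical multi-hot record is recovered from its gradients.
x = rng.integers(0, 2, size=(1, n_features)).astype(float)
gW, gb = linear_softmax_grads(W, b, x, np.array([1]))
x_hat = recover_input(gW, gb)
single_error = np.abs(x_hat - x[0]).max()

# Batch size 4: the same division returns a weighted mixture of the records.
X4 = rng.integers(0, 2, size=(4, n_features)).astype(float)
gW4, gb4 = linear_softmax_grads(W, b, X4, np.array([0, 1, 2, 1]))
x_mix = recover_input(gW4, gb4)
batch_error = min(np.abs(x_mix - row).max() for row in X4)

print("single-record max error:", single_error)
print("batch max error to nearest record:", batch_error)
```

Attacks on larger batches typically replace this closed-form step with an optimization over candidate inputs, which is where prior knowledge about plausible records can be brought in.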

Supplemental Material

w4g0146.mp4: supplemental video (mp4, 58 MB)

Published in

WWW '24: Proceedings of the ACM on Web Conference 2024
May 2024
4826 pages
ISBN: 9798400701719
DOI: 10.1145/3589334

Copyright © 2024 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall acceptance rate: 1,899 of 8,196 submissions (23%)