GraphLeak: Patient Record Leakage through Gradients with Knowledge Graph

Authors:
Xi Sheryl Zhang (Institute of Automation, Chinese Academy of Sciences, Beijing, China; ORCID 0000-0002-7660-2564)
Weifan Guan (Institute of Automation, Chinese Academy of Sciences, Beijing, China; ORCID 0009-0005-1360-967X)
Jiahao Lu (National University of Singapore, Singapore; ORCID 0000-0002-7569-6642)
Zhaopeng Qiu (Tencent, Beijing, China; ORCID 0009-0006-7067-3763)
Jian Cheng (Institute of Automation, Chinese Academy of Sciences, Beijing, China; ORCID 0000-0003-1289-2758)
Xian Wu (Jarvis Lab, Tencent, Beijing, China; ORCID 0000-0003-1118-9710)
Yefeng Zheng (Tencent, Shenzhen, China; ORCID 0000-0003-2195-2847)

WWW '24: Proceedings of the ACM on Web Conference 2024, May 2024, Pages 4706–4716. https://doi.org/10.1145/3589334.3648157

Published: 13 May 2024

ABSTRACT

In clinical practice, medical data are scattered across multiple hospitals. Because of security and privacy concerns, it is almost impossible to gather all the data in one place and train a unified model. Multi-node machine learning systems have therefore become the mainstream form of model training in healthcare. However, distributed training relies on exchanging gradients, which has been shown to carry a risk of privacy leakage: a malicious attacker can restore users' sensitive data from the shared gradients, a serious problem for highly private data such as Electronic Healthcare Records (EHRs). The performance of previous gradient attack methods drops rapidly as the training batch size increases, which makes them less threatening in practice. In this paper, however, we find that in the medical domain the leakage risk can be significantly amplified by leveraging prior knowledge such as a medical knowledge graph. In particular, we present GraphLeak, which incorporates the medical knowledge graph into gradient leakage attacks. GraphLeak improves the restoration quality of gradient attacks even with large batches of data. We conduct experiments on electronic healthcare record datasets, including eICU and MIMIC-III. Our method achieves state-of-the-art attack performance compared with previous work. Code is available at https://github.com/anonymous4ai/GraphLeak.
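
To make the leakage risk described above concrete, here is a minimal sketch of a well-known analytic property rather than the GraphLeak attack itself. For a single sample passing through a fully connected layer with a bias, each row of the weight gradient is the bias gradient times the input, so the input can be read off directly. The toy multi-hot record, the layer sizes, the squared-error loss, and the function names are all assumptions made for illustration. The last lines hint at why larger batches are harder: averaged gradients generally yield only a mixture of records.

```python
import numpy as np

def linear_layer_gradients(x, y, W, b):
    """Gradients of the loss 0.5 * ||W x + b - y||^2 with respect to W and b."""
    residual = W @ x + b - y          # dL/d(output)
    grad_W = np.outer(residual, x)    # dL/dW = residual * x^T
    grad_b = residual                 # dL/db = residual
    return grad_W, grad_b

def recover_input(grad_W, grad_b, eps=1e-12):
    """Read x off a gradient: for one sample, row i of grad_W equals grad_b[i] * x."""
    i = int(np.argmax(np.abs(grad_b)))  # pick the numerically safest row
    if abs(grad_b[i]) < eps:
        raise ValueError("bias gradient is zero; input cannot be recovered this way")
    return grad_W[i] / grad_b[i]

rng = np.random.default_rng(0)
x_private = rng.integers(0, 2, size=8).astype(float)  # hypothetical multi-hot record
y = rng.normal(size=3)
W = rng.normal(size=(3, 8))
b = rng.normal(size=3)

# Single-sample case: the shared gradient reveals the record.
grad_W, grad_b = linear_layer_gradients(x_private, y, W, b)
x_recovered = recover_input(grad_W, grad_b)

# Batch of two: gradients are averaged, so the same trick generally returns
# a weighted mixture of the two records instead of either one.
x_other = 1.0 - x_private
grad_W2, grad_b2 = linear_layer_gradients(x_other, y, W, b)
mixed = recover_input((grad_W + grad_W2) / 2, (grad_b + grad_b2) / 2)
```

In the single-sample case `x_recovered` matches `x_private`. Batch-level attacks such as the one the abstract describes need additional priors to separate the mixed records.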

Index Terms

- Applied computing → Life and medical sciences → Health informatics
- Computing methodologies → Artificial intelligence → Distributed artificial intelligence
- Computing methodologies → Machine learning → Machine learning algorithms → Regularization
- Computing methodologies → Machine learning → Machine learning approaches → Neural networks

Published in
WWW '24: Proceedings of the ACM on Web Conference 2024
May 2024
4826 pages
ISBN: 9798400701719
DOI: 10.1145/3589334
General Chairs: Tat-Seng Chua (National University of Singapore), Chong-Wah Ngo (Singapore Management University)
Proceedings Chair: Roy Ka-Wei Lee (Singapore University of Technology and Design)
Program Chairs: Ravi Kumar (Google), Hady W. Lauw (Singapore Management University)
Copyright © 2024 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
electronic health records
federated learning
gradient leakage
knowledge graph
Acceptance Rates
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%