ABSTRACT
In real-world clinical settings, medical data are scattered across multiple hospitals. Because of security and privacy concerns, it is almost impossible to gather all the data in one place and train a unified model. Multi-node machine learning systems have therefore become the mainstream form of model training in healthcare. Distributed training, however, relies on the exchange of gradients, which has been shown to carry a risk of privacy leakage: a malicious attacker can reconstruct a user's sensitive data from the publicly shared gradients. This is a serious problem for highly private data such as Electronic Healthcare Records (EHRs). The performance of previous gradient attack methods drops rapidly as the training batch size increases, which makes them less threatening in practice. In this paper, however, we find that in the medical domain the leakage risk can be significantly amplified by leveraging prior knowledge such as a medical knowledge graph. In particular, we present GraphLeak, which incorporates the medical knowledge graph into gradient leakage attacks. GraphLeak improves the reconstruction quality of gradient attacks even with large batches of data. We conduct experiments on electronic healthcare record datasets, including eICU and MIMIC-III, and our method achieves state-of-the-art attack performance compared with previous work. Code is available at https://github.com/anonymous4ai/GraphLeak.
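To make the leakage risk concrete, the sketch below shows the simplest well-known case of gradient leakage. It is not the GraphLeak method. It assumes a hypothetical linear softmax classifier over multi-hot medical codes; the names `linear_softmax_gradients` and `recover_input` and all sizes are illustrative. For a single record, each row of the weight gradient equals the bias gradient entry times the input, so the input follows from a division. For a batch, the same ratio only yields a weighted mixture of the records, which illustrates why plain attacks weaken as the batch size grows.

```python
import numpy as np

def linear_softmax_gradients(W, b, X, y):
    """Mean cross-entropy gradients of a linear softmax classifier."""
    logits = X @ W.T + b                          # (batch, classes)
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    probs[np.arange(len(y)), y] -= 1.0            # dL/dlogits per sample
    delta = probs / len(y)                        # average over the batch
    grad_W = delta.T @ X                          # (classes, features)
    grad_b = delta.sum(axis=0)                    # (classes,)
    return grad_W, grad_b

def recover_input(grad_W, grad_b):
    """For batch size 1, row i of grad_W equals grad_b[i] * x."""
    i = np.argmax(np.abs(grad_b))                 # row with the largest divisor
    return grad_W[i] / grad_b[i]

rng = np.random.default_rng(0)
n_codes, n_classes = 20, 4                        # illustrative sizes (assumption)
W = rng.normal(size=(n_classes, n_codes))
b = rng.normal(size=n_classes)

# One hypothetical patient visit as a multi-hot vector of medical codes.
x = (rng.random(n_codes) < 0.3).astype(float)
grad_W, grad_b = linear_softmax_gradients(W, b, x[None, :], np.array([2]))
x_rec = recover_input(grad_W, grad_b)
single_exact = np.allclose(x_rec, x)

# With a batch of 8 visits, the same ratio is a weighted mix of all records.
X = (rng.random((8, n_codes)) < 0.3).astype(float)
y = rng.integers(0, n_classes, size=8)
batch_grad_W, batch_grad_b = linear_softmax_gradients(W, b, X, y)
batch_rec = recover_input(batch_grad_W, batch_grad_b)

print("single record recovered exactly:", single_exact)
print("batch ratio, first five entries:", np.round(batch_rec[:5], 3))
```

Attacks on deeper models and larger batches generally replace this closed form with an optimization that matches dummy gradients to the shared ones, which is where additional priors can help narrow the search.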
Index Terms
- GraphLeak: Patient Record Leakage through Gradients with Knowledge Graph