DOI: 10.1145/3583780.3615019

PSLF: Defending Against Label Leakage in Split Learning

Published: 21 October 2023

ABSTRACT

With increasing concern over data privacy, split learning has become a widely used distributed machine learning paradigm in practice, in which two participants (the non-label party and the label party) own the raw features and the raw labels respectively and jointly train a model. Although no raw data is communicated between the two parties during training, several works have demonstrated that data privacy, especially label privacy, remains vulnerable in split learning, and have proposed defense algorithms against label attacks. However, the theoretical privacy guarantees of these algorithms are limited. In this work, we propose a novel Private Split Learning Framework (PSLF). In PSLF, the label party shares with the non-label party only the gradients computed from flipped labels, which strengthens the privacy of the raw labels; in addition, we design an extra sub-model trained on the true labels to improve prediction accuracy. We also design a Flipped Multi-Label Generation mechanism (FMLG), based on randomized response, for the label party to generate flipped labels. FMLG is proven to be differentially private, and the label party can trade off privacy against utility by setting the DP budget. In addition, we design an upsampling method to further protect the labels against some existing attacks. We evaluate PSLF on real-world datasets to demonstrate its effectiveness in protecting label privacy while achieving promising prediction accuracy.
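
The paper itself does not include code, but the label-flipping step can be illustrated concretely. Below is a minimal Python/NumPy sketch of randomized response over K classes with a DP budget epsilon: the true label is kept with probability e^epsilon / (e^epsilon + K - 1) and otherwise replaced by a uniformly random other class, which is the standard way to achieve epsilon-label differential privacy. The function name flip_labels_rr and its interface are illustrative assumptions, not the authors' FMLG implementation.

import numpy as np

def flip_labels_rr(labels, num_classes, epsilon, rng=None):
    # Hypothetical sketch (not the authors' FMLG code): keep each true label
    # with probability e^eps / (e^eps + K - 1); otherwise replace it with a
    # label drawn uniformly from the remaining K - 1 classes. This mechanism
    # satisfies epsilon-label differential privacy.
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels)
    keep_prob = np.exp(epsilon) / (np.exp(epsilon) + num_classes - 1)
    keep = rng.random(labels.shape) < keep_prob
    # A uniform offset in [1, K-1] maps the true label to a uniform draw
    # from the other classes.
    offsets = rng.integers(1, num_classes, size=labels.shape)
    flipped = (labels + offsets) % num_classes
    return np.where(keep, labels, flipped)

# Example: binary labels with budget epsilon = 1.0
# print(flip_labels_rr(np.array([0, 1, 1, 0, 1]), num_classes=2, epsilon=1.0))

A larger epsilon keeps the true label more often (better utility, weaker privacy), matching the privacy-utility trade-off described in the abstract.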


Published in

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
October 2023, 5508 pages
ISBN: 9798400701245
DOI: 10.1145/3583780
Copyright © 2023 ACM
Publisher: Association for Computing Machinery, New York, NY, United States
