ABSTRACT
With growing concern over data privacy, split learning has become a widely used distributed machine learning paradigm in which two participants, a non-label party holding the raw features and a label party holding the raw labels, jointly train a model. Although no raw data is exchanged between the two parties during training, several works have shown that data privacy, and label privacy in particular, remains vulnerable in split learning, and have proposed defense algorithms against label attacks. However, these algorithms come with limited theoretical guarantees on privacy preservation. In this work, we propose a novel Private Split Learning Framework (PSLF). In PSLF, the label party shares with the non-label party only the gradients computed from flipped labels, which strengthens the privacy of the raw labels; in addition, we design an extra sub-model trained on the true labels to improve prediction accuracy. We also design a Flipped Multi-Label Generation (FMLG) mechanism, based on randomized response, for the label party to generate flipped labels. FMLG is proven differentially private, and the label party can trade off privacy against utility by setting the DP budget. Furthermore, we design an upsampling method to further protect the labels against several existing attacks. We evaluate PSLF on real-world datasets, demonstrating its effectiveness in protecting label privacy while achieving promising prediction accuracy.
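To make the randomized-response idea behind FMLG concrete, the following is a minimal sketch of generalized randomized response for label flipping. This is not the paper's exact FMLG mechanism; it is a standard construction (the function name `rr_flip` and the interface are illustrative) that keeps the true label with probability e^ε / (e^ε + k − 1) and otherwise draws uniformly from the remaining k − 1 labels, which satisfies ε-label differential privacy:

```python
import math
import random

def rr_flip(label, num_classes, epsilon, rng=random):
    """Generalized randomized response for a categorical label.

    Keeps the true label with probability e^eps / (e^eps + k - 1);
    otherwise returns one of the other k - 1 labels uniformly at
    random. The ratio of output probabilities for any two inputs is
    bounded by e^eps, giving epsilon-label-DP.
    """
    keep_prob = math.exp(epsilon) / (math.exp(epsilon) + num_classes - 1)
    if rng.random() < keep_prob:
        return label
    # Flip: choose uniformly among the remaining labels.
    others = [c for c in range(num_classes) if c != label]
    return rng.choice(others)

# A larger budget keeps the label almost always; as epsilon -> 0 the
# output approaches a uniform draw over all classes.
rng = random.Random(0)
flip_rate = sum(rr_flip(1, 2, 4.0, rng) != 1 for _ in range(10_000)) / 10_000
print(flip_rate)  # empirically close to 1 / (e^4 + 1), i.e. roughly 0.02
```

The single parameter ε mirrors the privacy/utility trade-off described in the abstract: a small budget flips labels often (strong privacy, noisier gradients), a large budget rarely flips (weaker privacy, higher utility).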
PSLF: Defending Against Label Leakage in Split Learning