Chinese Journal of Intelligent Science and Technology ›› 2024, Vol. 6 ›› Issue (1): 64-75. doi: 10.11959/j.issn.2096-6652.202345

• Academic Papers •

Research on the explainability of vertical federated learning models based on human-in-the-loop

Xiaohuan LI1,2, Junbai ZHENG1,2, Jiawen KANG3, Jin YE2, Qian CHEN1,4

  1. Guangxi University Key Laboratory of Intelligent Networking and Scenario System (School of Information and Communication, Guilin University of Electronic Technology), Guilin 541004, China
    2.Guangxi Research Institute of Integrated Transportation Big Data, Nanning 530025, China
    3.School of Automation, Guangdong University of Technology, Guangzhou 510006, China
    4. School of Architecture and Transportation Engineering, Guilin University of Electronic Technology, Guilin 541004, China
  • Received: 2023-09-12; Revised: 2023-12-07; Online: 2024-03-15; Published: 2024-03-15
  • Contact: Qian CHEN, e-mail: chenqian@mails.guet.edu.cn
  • About the authors: Xiaohuan LI (1982- ), male, Ph.D., professor and doctoral supervisor at the School of Information and Communication, Guilin University of Electronic Technology. His main research interests include intelligent computing, the industrial Internet of things, and space-air-ground networks.
    Junbai ZHENG (1997- ), male, master's student at the School of Information and Communication, Guilin University of Electronic Technology. His main research interests include explainable machine learning and federated learning.
    Jiawen KANG (1989- ), male, Ph.D., professor under the Young Hundred Talents program at the School of Automation, Guangdong University of Technology. His main research interests include privacy protection, blockchain, and the industrial Internet of things.
    Jin YE (1970- ), female, Ph.D., professor and doctoral supervisor at the Guangxi Research Institute of Integrated Transportation Big Data. Her main research interests include network protocol design and intelligent computing.
    Qian CHEN (1984- ), female, M.S., associate professor at the School of Information and Communication, Guilin University of Electronic Technology. Her main research interests include the Internet of things, transportation big data, and intelligent computing.
  • Supported by:
    The National Natural Science Foundation of China (U22A2054); The Key Science and Technology Project of Guangxi (AA22068101)

Abstract:

Vertical federated learning (VFL) is commonly used for cross-domain data sharing in high-risk scenarios, where users need to understand and trust model decisions before the models can be put into practice. Existing research primarily focuses on the trade-off between explainability and privacy within VFL, and fails to fully meet users' needs for building trust in and fine-tuning models. To address these issues, we proposed an explainable vertical federated learning method based on human-in-the-loop (XVFL-HITL), which incorporated user feedback into VFL's Shapley value-based explanation process through a distributed HITL structure, using the knowledge of all participants to correct training data and enhance model performance. Furthermore, considering privacy concerns, the additive property of Shapley values was employed to merge the feature contributions of all parties other than the current participant into a single aggregated value, which effectively protected each participant's feature privacy. Experimental results on benchmark data indicated that the explanations produced by XVFL-HITL were effective while preserving users' feature privacy. Moreover, compared with feature selection by VFL-Random and by VFL-Shapley (which applies SHAP directly), XVFL-HITL improved model accuracy by approximately 14% and 11%, respectively.
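The privacy-preserving aggregation step can be illustrated with a short sketch. The Python fragment below is not from the paper; the attribution values, base value, and the assignment of features to participants are all hypothetical. It only demonstrates the additivity property the method relies on: the sum of a group's Shapley values equals that group's total contribution, so non-local attributions can be reported as one number without breaking the decomposition of the prediction.

    import numpy as np

    # Hypothetical per-sample Shapley values, e.g. from a SHAP-style
    # explainer run on the joint VFL model; values are illustrative only.
    phi = np.array([0.12, -0.05, 0.30, 0.08, -0.11, 0.02])
    feature_names = ["x0", "x1", "x2", "x3", "x4", "x5"]

    local_idx = [0, 1, 2]      # features held by the explaining participant
    external_idx = [3, 4, 5]   # features held by all other participants

    base_value = 0.41                     # explainer's expected model output
    prediction = base_value + phi.sum()   # additivity: f(x) = E[f] + sum(phi_i)

    # By additivity, the sum of a group's Shapley values is exactly that
    # group's total contribution, so collapsing the other parties'
    # attributions into one aggregate keeps the decomposition exact
    # while hiding their per-feature details.
    external_contribution = phi[external_idx].sum()

    for i in local_idx:
        print(f"{feature_names[i]}: {phi[i]:+.3f}")
    print(f"other parties (aggregated): {external_contribution:+.3f}")
    print(f"check: {base_value:.2f} + {phi.sum():+.3f} = {prediction:.3f}")

Because the decomposition stays exact, the displayed explanation still sums to the model output, while the per-feature contributions of the other participants remain hidden.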

Key words: vertical federated learning, explainability, human-in-the-loop, Shapley value
