SEDMA: Self-Distillation with Model Aggregation for Membership Privacy

Authors: Tsunato Nakai (Mitsubishi Electric Corporation), Ye Wang (Mitsubishi Electric Research Laboratories), Kota Yoshida (Ritsumeikan University), Takeshi Fujino (Ritsumeikan University)

Volume: 2024
Issue: 1
Pages: 494–508
DOI: https://doi.org/10.56553/popets-2024-0029


Abstract: Membership inference attacks (MIAs) are an important measure for evaluating potential risks of privacy leakage from machine learning (ML) models. State-of-the-art MIA defenses have achieved favorable privacy-utility trade-offs using knowledge distillation on split training datasets. However, such defenses increase computational costs, as a large number of ML models must be trained on the split datasets. In this study, we propose SEDMA, a new MIA defense based on self-distillation with model aggregation, inspired by the model parameter averaging used in federated learning. The key idea of SEDMA is to split the training dataset into several parts, train a model on each split, and aggregate the resulting models for self-distillation. Intuitively, model aggregation prevents over-fitting by smoothing information related to the training data across the multiple models while preserving model utility, as in federated learning. Through experiments on major benchmark datasets (Purchase100, Texas100, and CIFAR100), we show that SEDMA outperforms state-of-the-art MIA defenses in terms of membership privacy (MIA accuracy), model accuracy, and computational cost. Specifically, SEDMA incurs at most an approximately 3-5% drop in model accuracy while achieving the lowest MIA accuracy among state-of-the-art empirical MIA defenses, and it requires significantly less processing time than the previous defense with the best privacy-utility trade-off. SEDMA thus achieves both favorable privacy-utility trade-offs and low computational costs.
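To make the pipeline described in the abstract concrete, the sketch below illustrates the general idea in PyTorch: partition the training set, train one model per partition, average their parameters (as in federated-learning model averaging), and then self-distill a final model from the aggregated model's soft labels. This is a minimal illustration under assumptions, not the authors' implementation; all names and hyperparameters (make_model, NUM_SPLITS, TEMPERATURE, the optimizer settings) are hypothetical.

    # Minimal sketch of the SEDMA idea (illustrative only, not the paper's code).
    import copy
    import torch
    import torch.nn.functional as F
    from torch.utils.data import DataLoader, random_split

    NUM_SPLITS = 5      # number of disjoint training-set partitions (assumed)
    TEMPERATURE = 2.0   # softmax temperature for distillation (assumed)

    def train(model, loader, epochs=10):
        # Ordinary supervised training on one split of the data.
        opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
        model.train()
        for _ in range(epochs):
            for x, y in loader:
                opt.zero_grad()
                F.cross_entropy(model(x), y).backward()
                opt.step()
        return model

    def aggregate(models):
        # Average model parameters, as in federated-learning model averaging.
        avg = copy.deepcopy(models[0])
        state = avg.state_dict()
        for key in state:
            state[key] = torch.stack(
                [m.state_dict()[key].float() for m in models]).mean(dim=0)
        avg.load_state_dict(state)
        return avg

    def sedma(train_set, make_model):
        # 1. Split the training data into disjoint parts.
        sizes = [len(train_set) // NUM_SPLITS] * NUM_SPLITS
        sizes[-1] += len(train_set) - sum(sizes)
        splits = random_split(train_set, sizes)
        # 2. Train one model per split.
        models = [train(make_model(),
                        DataLoader(s, batch_size=64, shuffle=True))
                  for s in splits]
        # 3. Aggregate the split models by parameter averaging.
        teacher = aggregate(models)
        teacher.eval()
        # 4. Self-distill: train the released model on the teacher's
        #    temperature-softened predictions over the full training set.
        student = make_model()
        opt = torch.optim.SGD(student.parameters(), lr=0.01, momentum=0.9)
        student.train()
        for x, _ in DataLoader(train_set, batch_size=64, shuffle=True):
            with torch.no_grad():
                soft = F.softmax(teacher(x) / TEMPERATURE, dim=1)
            loss = F.kl_div(F.log_softmax(student(x) / TEMPERATURE, dim=1),
                            soft, reduction="batchmean")
            opt.zero_grad()
            loss.backward()
            opt.step()
        return student

The returned student is the model that would be released; the intuition from the abstract is that averaging across split-trained models smooths out memorized, example-specific information before distillation, which is what reduces MIA accuracy while keeping utility.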

Keywords: membership inference attacks, privacy-preserving machine learning, self-distillation, model aggregation

Copyright in PoPETs articles is held by their authors. This article is published under a Creative Commons Attribution 4.0 license.