ABSTRACT
Graph Neural Networks (GNNs) have demonstrated superior performance in learning node representations for various graph inference tasks. However, learning over graph data can raise privacy concerns when nodes represent people or human-related variables that involve sensitive or personal information. In this paper, we study the problem of node data privacy, where graph nodes (e.g., social network users) hold potentially sensitive data that is kept private but could be beneficial to a central server training a GNN over the graph. To address this problem, we propose a privacy-preserving, architecture-agnostic GNN learning framework with formal privacy guarantees based on Local Differential Privacy (LDP). Specifically, we develop a locally private mechanism to perturb and compress node features, which the server can efficiently collect to approximate the GNN's neighborhood aggregation step. Furthermore, to improve the accuracy of this estimation, we prepend to the GNN a denoising layer, called KProp, which is based on the multi-hop aggregation of node features. Finally, we propose a robust algorithm for learning with privatized noisy labels, where we again benefit from KProp's denoising capability to increase the accuracy of label inference for node classification. Extensive experiments conducted over real-world datasets demonstrate that our method can maintain a satisfactory level of accuracy with low privacy loss.
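As a rough illustration of the two ideas named in the abstract (perturbing a node feature locally, then averaging over neighbors to reduce the noise), the following Python sketch uses a generic one-bit randomized mechanism on a single feature scaled to [0, 1], followed by repeated mean aggregation. It is not the mechanism or the KProp layer defined in the paper: the function names `one_bit_perturb`, `debias`, and `propagate`, the [0, 1] feature range, and the adjacency-list graph format are all assumptions made for this example.

```python
import math
import random


def one_bit_perturb(x, eps, rng):
    """Node side: turn a feature x in [0, 1] into one bit under eps-LDP.

    Assumed generic one-bit mechanism, not the paper's own. The probability
    of reporting 1 rises linearly with x from 1/(e^eps + 1) to
    e^eps/(e^eps + 1), so any two inputs differ by a factor of at most e^eps.
    """
    e = math.exp(eps)
    p = 1.0 / (e + 1.0) + x * (e - 1.0) / (e + 1.0)
    return 1 if rng.random() < p else 0


def debias(bit, eps):
    """Server side: unbiased estimate of x from a single reported bit."""
    e = math.exp(eps)
    return (bit * (e + 1.0) - 1.0) / (e - 1.0)


def propagate(values, neighbors, steps):
    """Apply `steps` rounds of mean aggregation over each node's neighbors.

    `neighbors[v]` lists the nodes adjacent to v (hypothetical format).
    A node without neighbors keeps its current value.
    """
    for _ in range(steps):
        values = [
            sum(values[u] for u in neighbors[v]) / len(neighbors[v])
            if neighbors[v] else values[v]
            for v in range(len(values))
        ]
    return values
```

In this sketch, each node would send `one_bit_perturb(x, eps, rng)` to the server, which applies `debias` to every report and then calls `propagate` on the resulting list. Individual estimates are very noisy, but because each one is unbiased, averaging over more neighbors (or more hops) tends to cancel the noise, which is the intuition behind a multi-hop denoising step.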