DOI: 10.1145/3589334.3645398 · ACM Conferences · WWW Conference Proceedings
Research article · Free access

Memory Disagreement: A Pseudo-Labeling Measure from Training Dynamics for Semi-supervised Graph Learning

Published: 13 May 2024

ABSTRACT

In semi-supervised graph learning, pseudo-labeling is a pivotal strategy for exploiting both labeled and unlabeled nodes during model training. The confidence score is currently the most widely used pseudo-labeling measure; however, it suffers from poor calibration and is unreliable on out-of-distribution data. In this paper, we propose memory disagreement (MoDis for short), a novel uncertainty measure for pseudo-labeling. We uncover that training dynamics offer significant insight into prediction uncertainty: if a graph model makes consistent predictions for an unlabeled node throughout training, the corresponding predicted label is likely to be correct, and the node is therefore a good candidate for pseudo-labeling. This idea is supported by recent studies on training dynamics. We implement MoDis as the entropy of an accumulated distribution that summarizes the disagreement among the model's predictions throughout training. We further enhance and analyze MoDis in case studies, which show that nodes with low MoDis are well suited for pseudo-labeling, as they tend to lie far from decision boundaries in both the graph and the representation space. We design a MoDis-based pseudo-label selection algorithm and a corresponding pseudo-labeling algorithm, both applicable to various graph neural networks. We empirically validate MoDis on eight benchmark graph datasets. The experimental results show that pseudo-labels selected by MoDis are of higher quality in both correctness and information gain, and that the algorithm benefits various graph neural networks, achieving an average relative improvement of 3.11%, and up to 30.24%, over the widely used confidence score. Moreover, we demonstrate the efficacy of MoDis on out-of-distribution nodes.
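The abstract describes MoDis as the entropy of an accumulated distribution of a model's predictions over training: a node whose predicted label never changes gets zero entropy (low uncertainty), while a node whose prediction flips between classes gets high entropy. The paper's exact accumulation scheme and enhancements are not given here; the following is a minimal illustrative sketch of that core idea, in which the function name `modis_scores` and the `pred_history` layout are assumptions, not the authors' API.

```python
import numpy as np

def modis_scores(pred_history):
    """Entropy-based disagreement score per node.

    pred_history: (T, N) integer array; entry [t, i] is the class the
    model predicted for node i at training checkpoint t.
    Returns an (N,) array; lower scores mean more consistent predictions,
    i.e. better pseudo-labeling candidates under the MoDis idea.
    """
    T, N = pred_history.shape
    num_classes = int(pred_history.max()) + 1
    scores = np.empty(N)
    for i in range(N):
        # Accumulate how often each class was predicted for node i.
        counts = np.bincount(pred_history[:, i], minlength=num_classes)
        p = counts / T
        nz = p[p > 0]  # drop zero-probability classes (0 * log 0 := 0)
        scores[i] = -(nz * np.log(nz)).sum()
    return scores
```

Under this sketch, pseudo-label selection would simply take the nodes with the smallest scores (e.g. `np.argsort(scores)[:k]`) and assign each its majority predicted class.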

Supplemental Material

rfp0479.mp4 (supplemental video, mp4, 126.7 MB)


Published in

WWW '24: Proceedings of the ACM Web Conference 2024
May 2024 · 4826 pages
ISBN: 9798400701719
DOI: 10.1145/3589334

Copyright © 2024 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 1,899 of 8,196 submissions, 23%
