ABSTRACT
Graph contrastive learning often struggles when data augmentations corrupt a graph's critical attributes, risking the creation of noisy positive pairs. Recent methods that attempt to address this either fail to guarantee effective augmentation or incur excessive computational cost. Meanwhile, full-attention graph Transformers have attracted significant interest for their strong capacity in graph representation learning. Despite this potential, applying full-attention graph Transformers to contrastive learning can introduce problems of its own, such as noisy redundancy. In this work, we propose the Graph Attention Contrastive Learning (GACL) model, which combines a full-attention Transformer with a message-passing graph neural network as its encoder and applies a denoising modification to mitigate the noise inherent in full attention. GACL thus tackles the drawbacks of full-attention mechanisms while offering a new approach to data augmentation. We further introduce the concept of effective mutual information to provide a theoretical foundation for our method, using it to analyze the role of the denoising matrix in GACL's contrastive learning process and to discuss its broader implications. Empirical evaluations demonstrate GACL's strong performance, establishing it as a state-of-the-art approach to graph contrastive learning.
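To make the described architecture concrete, the sketch below shows one way a GACL-style encoder could pair a message-passing branch with denoised full attention, followed by a standard NT-Xent contrastive loss. This is a minimal illustration in PyTorch under our own assumptions: every name (`DenoisedFullAttention`, `HybridEncoder`, the threshold `tau`) is hypothetical rather than taken from the paper's implementation, and the simple thresholding of attention weights merely stands in for whatever denoising modification the authors actually employ.

```python
# Hypothetical sketch of a GACL-style model: an MPNN branch combined with
# full (dense) self-attention whose weights are sparsified by a denoising
# step. Illustrative only; not the authors' released implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DenoisedFullAttention(nn.Module):
    """Single-head full attention over all node pairs; attention weights
    below a threshold are zeroed to suppress noisy pairwise interactions."""

    def __init__(self, dim, tau=0.01):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.tau = tau  # denoising threshold (assumed hyperparameter)

    def forward(self, x):
        # x: (num_nodes, dim) node features of one graph
        attn = torch.softmax(
            self.q(x) @ self.k(x).T / x.size(-1) ** 0.5, dim=-1)
        # Zero out weak attention weights, then renormalize each row.
        attn = torch.where(attn > self.tau, attn, torch.zeros_like(attn))
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-12)
        return attn @ self.v(x)


class HybridEncoder(nn.Module):
    """Combines a GCN-style message-passing branch with the denoised
    full-attention branch, then mean-pools to a graph-level embedding."""

    def __init__(self, in_dim, dim):
        super().__init__()
        self.lin_in = nn.Linear(in_dim, dim)
        self.mp = nn.Linear(dim, dim)  # message-passing transform
        self.attn = DenoisedFullAttention(dim)

    def forward(self, x, adj):
        # x: (N, in_dim) node features; adj: (N, N) normalized adjacency
        h = F.relu(self.lin_in(x))
        h = F.relu(adj @ self.mp(h)) + self.attn(h)  # sum of both branches
        return h.mean(dim=0)  # graph embedding


def nt_xent(z1, z2, temperature=0.5):
    """Standard NT-Xent contrastive loss between two batches of graph views;
    matched rows of z1 and z2 are positives, everything else negatives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.T / temperature                       # (B, B) similarities
    labels = torch.arange(z1.size(0), device=z1.device)    # positives on diagonal
    return F.cross_entropy(logits, labels)


# Example usage: encode two augmented views of each graph and contrast them.
# enc = HybridEncoder(in_dim=16, dim=64)
# z1 = torch.stack([enc(x, adj) for x, adj in view1_graphs])
# z2 = torch.stack([enc(x, adj) for x, adj in view2_graphs])
# loss = nt_xent(z1, z2)
```

The additive combination of the two branches and the NT-Xent objective are common design choices in graph contrastive learning; the paper's actual fusion scheme, augmentation strategy, and denoising matrix may differ.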