DOI: 10.1145/3589334.3645717
Research Article · Open Access

Full-Attention Driven Graph Contrastive Learning: with Effective Mutual Information Insight

Published: 13 May 2024

ABSTRACT

Graph contrastive learning often struggles when data augmentations compromise a graph's critical attributes, risking the creation of noisy positive pairs. Recent methods that attempt to address this issue either fall short of ensuring effective data augmentation or incur excessive computational cost. Meanwhile, the advent of full-attention graph Transformers, with their enhanced capacity for graph representation learning, has sparked significant interest. Despite their potential, however, employing full-attention graph Transformers for contrastive learning can introduce problems such as noisy redundancies. In this work, we propose the Graph Attention Contrastive Learning (GACL) model, which combines a full-attention Transformer with a message-passing graph neural network as its encoder. To mitigate the noise associated with full attention, we apply a denoising modification. GACL thereby addresses the challenges of full-attention mechanisms while introducing a novel approach to data augmentation. Moreover, we propose the concept of effective mutual information to theoretically underpin our methodology; within this framework, we analyze the impact of the denoising matrix on GACL's contrastive learning process and discuss its implications in depth. Empirical evaluations underscore GACL's exceptional performance, establishing it as a state-of-the-art solution for graph contrastive learning.
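As a concrete illustration of the encoder described above, the following is a minimal PyTorch sketch that pairs full attention over all node pairs with a GCN-style message-passing update, and sparsifies the attention weights as a stand-in for the paper's denoising step. The names (GACLEncoderSketch, nt_xent, keep_ratio), the top-k denoising rule, and the NT-Xent objective are illustrative assumptions, not the authors' exact formulation.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class GACLEncoderSketch(nn.Module):
    """Illustrative GACL-style encoder: full attention over all node
    pairs, sparsified by a denoising step, combined with a GCN-style
    message-passing update. Names and the top-k denoising rule are
    assumptions for exposition, not the paper's exact design."""

    def __init__(self, dim: int, keep_ratio: float = 0.5):
        super().__init__()
        self.q, self.k, self.v = (nn.Linear(dim, dim) for _ in range(3))
        self.mp = nn.Linear(dim, dim)   # message-passing transform
        self.norm = nn.LayerNorm(dim)
        self.keep_ratio = keep_ratio    # fraction of attention weights kept

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (n, dim) node features; adj: (n, n) row-normalized adjacency.
        q, k, v = self.q(x), self.k(x), self.v(x)
        w = (q @ k.t() / math.sqrt(x.size(-1))).softmax(dim=-1)  # full attention
        # Denoising matrix (assumed top-k form): keep only the largest
        # attention weights in each row, zero the rest, and renormalize.
        k_keep = max(1, int(self.keep_ratio * w.size(-1)))
        idx = w.topk(k_keep, dim=-1).indices
        mask = torch.zeros_like(w).scatter_(-1, idx, 1.0)
        wm = w * mask
        w = wm / wm.sum(dim=-1, keepdim=True)
        # Combine the denoised attention branch with message passing.
        return self.norm(x + w @ v + adj @ self.mp(x))


def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """Standard NT-Xent/InfoNCE contrastive loss between two views."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau                          # pairwise similarities
    labels = torch.arange(z1.size(0), device=z1.device)  # positives on diagonal
    return F.cross_entropy(logits, labels)
```

For example, GACLEncoderSketch(64)(torch.randn(8, 64), torch.eye(8)) yields denoised embeddings for an 8-node graph; encoding two augmented views this way and comparing them with nt_xent gives the usual contrastive training signal.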


Supplemental Material

rfp2509.mp4 (supplemental video, MP4, 18 MB)


Published in

WWW '24: Proceedings of the ACM Web Conference 2024
May 2024, 4826 pages
ISBN: 9798400701719
DOI: 10.1145/3589334
Copyright © 2024 ACM


Publisher: Association for Computing Machinery, New York, NY, United States

