ABSTRACT
Graph contrastive learning often struggles when data augmentations corrupt a graph's critical attributes, risking the creation of noisy positive pairs. Recent methods that attempt to address this either fail to guarantee effective augmentation or incur excessive computational cost. Meanwhile, full-attention graph Transformers have attracted significant interest for their strong capacity in graph representation learning. Despite this potential, applying full-attention graph Transformers to contrastive learning can introduce problems of its own, such as noisy redundancy. In this work, we propose the Graph Attention Contrastive Learning (GACL) model, which combines a full-attention Transformer with a message-passing graph neural network as its encoder and applies a denoising modification to mitigate the noise inherent in full attention. GACL thus tackles the drawbacks of full-attention mechanisms while offering a new approach to data augmentation. We further introduce the concept of effective mutual information to provide a theoretical foundation for our method, using it to analyze the role of the denoising matrix in GACL's contrastive learning process and to discuss its broader implications. Empirical evaluations demonstrate GACL's strong performance, establishing it as a state-of-the-art approach to graph contrastive learning.
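To make the described architecture concrete, the sketch below shows one way a GACL-style encoder could pair a message-passing branch with denoised full attention, followed by a standard NT-Xent contrastive loss. This is a minimal illustration in PyTorch under our own assumptions: every name (`DenoisedFullAttention`, `HybridEncoder`, the threshold `tau`) is hypothetical rather than taken from the paper's implementation, and the simple thresholding of attention weights merely stands in for whatever denoising modification the authors actually employ.

```python
# Hypothetical sketch of a GACL-style model: an MPNN branch combined with
# full (dense) self-attention whose weights are sparsified by a denoising
# step. Illustrative only; not the authors' released implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DenoisedFullAttention(nn.Module):
    """Single-head full attention over all node pairs; attention weights
    below a threshold are zeroed to suppress noisy pairwise interactions."""

    def __init__(self, dim, tau=0.01):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.tau = tau  # denoising threshold (assumed hyperparameter)

    def forward(self, x):
        # x: (num_nodes, dim) node features of one graph
        attn = torch.softmax(
            self.q(x) @ self.k(x).T / x.size(-1) ** 0.5, dim=-1)
        # Zero out weak attention weights, then renormalize each row.
        attn = torch.where(attn > self.tau, attn, torch.zeros_like(attn))
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-12)
        return attn @ self.v(x)


class HybridEncoder(nn.Module):
    """Combines a GCN-style message-passing branch with the denoised
    full-attention branch, then mean-pools to a graph-level embedding."""

    def __init__(self, in_dim, dim):
        super().__init__()
        self.lin_in = nn.Linear(in_dim, dim)
        self.mp = nn.Linear(dim, dim)  # message-passing transform
        self.attn = DenoisedFullAttention(dim)

    def forward(self, x, adj):
        # x: (N, in_dim) node features; adj: (N, N) normalized adjacency
        h = F.relu(self.lin_in(x))
        h = F.relu(adj @ self.mp(h)) + self.attn(h)  # sum of both branches
        return h.mean(dim=0)  # graph embedding


def nt_xent(z1, z2, temperature=0.5):
    """Standard NT-Xent contrastive loss between two batches of graph views;
    matched rows of z1 and z2 are positives, everything else negatives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.T / temperature                       # (B, B) similarities
    labels = torch.arange(z1.size(0), device=z1.device)    # positives on diagonal
    return F.cross_entropy(logits, labels)


# Example usage: encode two augmented views of each graph and contrast them.
# enc = HybridEncoder(in_dim=16, dim=64)
# z1 = torch.stack([enc(x, adj) for x, adj in view1_graphs])
# z2 = torch.stack([enc(x, adj) for x, adj in view2_graphs])
# loss = nt_xent(z1, z2)
```

The additive combination of the two branches and the NT-Xent objective are common design choices in graph contrastive learning; the paper's actual fusion scheme, augmentation strategy, and denoising matrix may differ.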