ABSTRACT
Graph Neural Networks (GNNs) have succeeded in a wide range of applications, yet deep GNNs underperform their shallow counterparts, in contrast to the benefits of depth in other areas of deep learning. Over-smoothing and over-squashing are key obstacles to stacking graph convolutional layers: the former blurs node representations into indistinguishability, while the latter hinders the propagation of information from distant nodes. Our work reveals that both phenomena are intrinsically related to the spectral gap of the graph Laplacian, which induces an unavoidable trade-off: the two issues cannot be alleviated simultaneously. To strike a suitable compromise, we propose adding and removing edges. We introduce the Stochastic Jost and Liu Curvature Rewiring (SJLR) algorithm, which is computationally cheaper than previous curvature-based methods while preserving their fundamental properties. Unlike existing approaches, SJLR adds and removes edges during GNN training but leaves the graph unchanged at test time. A comprehensive comparison demonstrates SJLR's competitive performance in addressing over-smoothing and over-squashing.
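The Jost and Liu curvature that gives SJLR its name admits a cheap combinatorial lower bound on the Ollivier-Ricci curvature of an edge, computable from the two endpoint degrees and the number of triangles through the edge. The sketch below, a plain-Python illustration and not the paper's actual SJLR scoring or stochastic add/remove policy (which the abstract does not specify), shows how negatively curved bottleneck edges can be flagged as rewiring candidates:

```python
# Jost-Liu lower bound on the Ollivier-Ricci curvature of an edge (x, y):
#   kappa(x, y) >= -(1 - 1/dx - 1/dy - T/min(dx, dy))_+
#                  -(1 - 1/dx - 1/dy - T/max(dx, dy))_+
#                  + T/max(dx, dy)
# where T is the number of triangles containing (x, y) and (a)_+ = max(a, 0).

def jost_liu_lower_bound(adj, x, y):
    """adj: dict mapping node -> set of neighbors (undirected graph)."""
    dx, dy = len(adj[x]), len(adj[y])
    tri = len(adj[x] & adj[y])       # common neighbors = triangles through (x, y)
    pos = lambda a: max(a, 0.0)      # positive part (a)_+
    return (-pos(1 - 1/dx - 1/dy - tri / min(dx, dy))
            - pos(1 - 1/dx - 1/dy - tri / max(dx, dy))
            + tri / max(dx, dy))

# Toy graph: two triangles joined by a bridge (2, 3) -- a classic bottleneck.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
       3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(jost_liu_lower_bound(adj, 2, 3))  # bridge edge: negative bound
print(jost_liu_lower_bound(adj, 0, 1))  # intra-triangle edge: positive bound
```

In a curvature-based rewiring scheme, edges with strongly negative bounds mark bottlenecks (over-squashing candidates for edge addition nearby), while highly positive ones mark densely connected regions (candidates for removal to counter over-smoothing).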
On the Trade-off between Over-smoothing and Over-squashing in Deep Graph Neural Networks