Knowledge-Based Systems

Volume 260, 25 January 2023, 110120

SMGCL: Semi-supervised Multi-view Graph Contrastive Learning

https://doi.org/10.1016/j.knosys.2022.110120

Abstract

Graph contrastive learning (GCL), which aims to generate supervision information by transforming the graph data itself, is increasingly becoming a focus of graph research. It has shown promising performance in graph representation learning by extracting global-level abstract features of graphs. Nonetheless, most GCL methods are performed in a completely unsupervised manner and struggle to balance the multi-view information of graphs. To alleviate this, we propose a Semi-supervised Multi-view Graph Contrastive Learning (SMGCL) framework for graph classification. The framework captures the comparative relations between label-independent and label-dependent node (or graph) pairs across different views. In particular, we devise a graph neural network (GNN)-based label augmentation module to exploit the label information and guarantee the discriminative power of the learned representations. In addition, a shared decoder module is added to extract the underlying determinative relationship between the learned representations and the graph topology. Experimental results on graph classification tasks demonstrate the superiority of the proposed framework.

Introduction

Contrastive learning (CL) is a machine learning technique for self-supervised representation learning that learns general data features by pulling positive data pairs together and pushing negative data pairs apart in the embedding space [1]. CL is used extensively in a variety of practical scenarios, such as computer vision [2], [3] and natural language processing [4], [5], particularly when unlabeled data are abundant but labeled data are scarce. Recently, the success of CL on vision and language tasks has inspired its application to graph data, yielding a range of graph contrastive learning (GCL) methods.
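The pull-together/push-apart principle is commonly realized with an InfoNCE-style objective. The following is a minimal plain-Python sketch of that general idea, not the exact loss used by any method discussed here; `tau` and the cosine similarity are illustrative choices:

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, tau=0.5):
    """InfoNCE-style loss: small when the anchor is close to its
    positive and far from the negatives, large in the opposite case."""
    pos = math.exp(cos_sim(anchor, positive) / tau)
    neg = sum(math.exp(cos_sim(anchor, n) / tau) for n in negatives)
    return -math.log(pos / (pos + neg))
```

Minimizing this loss over many (anchor, positive, negatives) triplets pulls each positive pair together and pushes the anchor away from its negatives in the embedding space.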

GCL can generate supervision information by transforming the graph data itself, even without labels, which greatly alleviates the problems arising from scarce labeled data. Graph neural networks (GNNs) [6], a popular family of deep learning algorithms operating on graph data, have shown impressive capabilities in graph representation learning. Existing GCL methods generally employ GNNs as encoders and leverage graph augmentation to produce data pairs for contrastive learning. Data pairs may take the form of (node, node), (node, graph), (graph, subgraph), (graph, graph), and so on. Graph representations are then obtained by maximizing the mutual information (MI) between the latent features of the generated data pairs. Most GCL methods [7], [8], [9] adopt a completely unsupervised schema for contrastive learning. Although these methods can achieve strong performance by generating supervision information from unlabeled data, they do not effectively utilize scarce yet valuable labeled data, resulting in a lack of discriminative power in the learned representations [10]. CG3 [10] further takes into account the consistency between label-oriented data pairs and accomplishes semi-supervised contrastive learning based on the original graph topology. However, it ignores the fact that multi-view augmentation of graphs can further strengthen the robustness of a contrastive learning model [11].
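A common way to produce such data pairs is stochastic graph augmentation: perturbing the same graph twice yields two views that form a positive pair. A minimal sketch of one standard perturbation, random edge dropping (an illustrative example, not a specific method's augmentation pipeline):

```python
import random

def drop_edges(edges, p=0.2, seed=0):
    """Graph augmentation: randomly remove a fraction p of the edges,
    producing an alternative view of the same graph."""
    rng = random.Random(seed)
    return [e for e in edges if rng.random() >= p]

# Two independently augmented views of one graph form a positive
# pair for contrastive learning.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
view1 = drop_edges(edges, p=0.2, seed=1)
view2 = drop_edges(edges, p=0.2, seed=2)
```

Other augmentations (node feature masking, subgraph sampling, diffusion-based views) follow the same pattern: the perturbed views should preserve the graph's semantics while differing in surface structure.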

We summarize the representative GCL methods in Table 1. As can be seen from Table 1, there is a dearth of work that supports both semi-supervised graph contrastive learning and the application of multi-view graph augmentation procedures. The primary challenge is how to properly utilize the limited label information throughout the GCL process while balancing the interactions across multi-view information. This direction deserves special attention, and it is also the focus of the work in this paper.

In this work, we propose Semi-supervised Multi-view Graph Contrastive Learning (SMGCL), a framework for extracting sufficient contrastive supervision information and learning discriminative representations of graph data. The proposed framework conducts semi-supervised contrastive learning from two perspectives. The first is the design of a semi-supervised contrastive learning loss to promote intra-class compactness and inter-class separability. Specifically, this involves an unsupervised loss that maximizes the MI between representations of nodes and their subgraphs, and a supervised loss that enhances the agreement between representations of node pairs with the same label. The second is to fine-tune the contrastive learning procedure using a semi-supervised classification module. This module combines a GNN encoder with a fully connected (FC) layer to predict node (or subgraph) labels, and employs a label augmentation strategy to further exploit the prediction outcomes of unlabeled nodes. Furthermore, to extract the underlying determinative relationship between the learned representations and the input graph topology, we add a decoder module shared across different views, which converts the latent representations back to the original graph and serves as a complementary supervision signal for graph contrastive learning. To verify the effectiveness of the proposed SMGCL framework, we carry out comprehensive tests on several real-world datasets. Experimental results on node classification and graph classification tasks demonstrate that the proposed framework outperforms state-of-the-art comparison methods. The main contributions of this work are summarized below:

  • We propose a semi-supervised multi-view graph contrastive learning framework. The framework can integrate information from both topology and feature views and extract the supervision information in a semi-supervised manner.

  • We build a semi-supervised contrastive learning module with graph reconstruction to integrate and balance multiple graph information. In addition, we introduce a semi-supervised GNN with label augmentation module to enhance the discriminative power of the learned graph representations.

  • We evaluate the proposed framework on node classification and graph classification tasks on several real-world datasets and demonstrate the superiority of our framework over state-of-the-art comparative methods.
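To make the supervised side of such an objective concrete, here is a schematic plain-Python sketch of a loss term that rewards agreement between embeddings of labeled nodes sharing a class. This is our own illustrative simplification, not SMGCL's exact formulation; `cos_sim`, `tau`, and the pair-averaging are assumptions:

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def supervised_pair_loss(embeddings, labels, tau=0.5):
    """Supervised contrastive term (schematic): encourage agreement
    between embeddings of labeled node pairs with the same class.
    Unlabeled nodes carry label None and are skipped."""
    total, count = 0.0, 0
    for i in range(len(embeddings)):
        for j in range(len(embeddings)):
            if i != j and labels[i] is not None and labels[i] == labels[j]:
                # Higher similarity between same-label pairs -> lower loss.
                total += -cos_sim(embeddings[i], embeddings[j]) / tau
                count += 1
    return total / max(count, 1)
```

In a semi-supervised setting, a term like this would be combined with an unsupervised MI-based loss over all nodes, so that the scarce labels sharpen class boundaries while the unlabeled data shape the overall geometry.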


Graph Neural Networks

Graph Neural Networks (GNNs) are a family of deep neural networks that model graph data in non-Euclidean space through a recursive neighborhood aggregation scheme [13]. In recent years, GNNs have attracted extensive research effort and have significantly improved the performance of large-scale data-driven tasks, including node classification, graph classification, and community detection. Bruna et al. [14] presented the first prominent study on the
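The recursive neighborhood aggregation scheme mentioned above can be sketched as a single mean-aggregation layer in plain Python (an illustrative simplification; real GNN layers add learned weight matrices and nonlinearities):

```python
def aggregate_layer(features, adj):
    """One round of neighborhood aggregation: each node's new feature
    is the mean of its own feature and its neighbors' features.
    Stacking such layers yields the recursive scheme used by GNNs,
    with each extra layer widening a node's receptive field by one hop."""
    out = []
    for i in range(len(features)):
        neigh = [features[j] for j in adj[i]] + [features[i]]
        out.append([sum(col) / len(neigh) for col in zip(*neigh)])
    return out
```

For example, two mutually connected nodes with features [1.0] and [3.0] both aggregate to [2.0] after one layer, since each averages its own feature with its neighbor's.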

The proposed method

In this section, we first formulate the preliminaries about the semi-supervised learning on graphs and then introduce the framework of the proposed SMGCL. For clarity, we summarize the frequently used notations in Table 2.

Experiments

Experiments are performed on several real-world datasets to evaluate the following three aspects:

  • (1) Classification ability: Evaluate the discriminative power of the learned representations on node and graph classification tasks.

  • (2) Effect of the core components: Several ablation studies are conducted to show the contribution of the core components (the supervised contrastive loss Lnn, the graph decoding loss Ld, and the consistency loss Lcon) to the performance of the entire model.

  • (3) Sensitivity of the key parameters:

Conclusion and future work

In this paper, we proposed SMGCL, a semi-supervised multi-view graph contrastive learning framework. The framework incorporates multi-view graph information for contrastive learning and extracts supervision information in a semi-supervised manner. A semi-supervised contrastive learning loss is designed to promote intra-class compactness and inter-class separability, which facilitates the full utilization of labeled and unlabeled data to achieve excellent classification

CRediT authorship contribution statement

Hui Zhou: Conceptualization, Methodology, Software, Writing – original draft, Writing – review & editing. Maoguo Gong: Validation, Resources, Supervision. Shanfeng Wang: Methodology, Visualization, Investigation. Yuan Gao: Data curation, Writing – review & editing. Zhongying Zhao: Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant nos. 62036006 and 62072288) and the Taishan Scholar Program of Shandong Province.

References (58)

  • P. Velickovic, W. Fedus, W.L. Hamilton, P. Liò, Y. Bengio, R.D. Hjelm, Deep Graph Infomax, in: Proceedings of the 7th...
  • F. Sun, J. Hoffmann, V. Verma, J. Tang, InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning...
  • K. Hassani et al., Contrastive multi-view representation learning on graphs

  • S. Wan, S. Pan, J. Yang, C. Gong, Contrastive and Generative Graph Convolutional Networks for Graph-based...
  • Y. You, T. Chen, Y. Sui, T. Chen, Z. Wang, Y. Shen, Graph Contrastive Learning with Augmentations, in: Advances in...
  • Y. Zhu et al., Deep graph contrastive representation learning (2020)
  • Z. Zhang et al., Deep learning on graphs: A survey, IEEE Trans. Knowl. Data Eng. (2022)
  • J. Bruna, W. Zaremba, A. Szlam, Y. LeCun, Spectral Networks and Locally Connected Networks on Graphs, in: Proceedings...
  • M. Defferrard, X. Bresson, P. Vandergheynst, Convolutional neural networks on graphs with fast localized spectral...
  • T.N. Kipf, M. Welling, Semi-Supervised Classification with Graph Convolutional Networks, in: Proceedings of the 5th...
  • W. Hamilton, Z. Ying, J. Leskovec, Inductive representation learning on large graphs, in: Proceedings of the...
  • P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Liò, Y. Bengio, Graph Attention Networks, in: Proceedings of the...
  • F. Monti, D. Boscaini, J. Masci, E. Rodola, J. Svoboda, M.M. Bronstein, Geometric deep learning on graphs and manifolds...
  • K. Xu, C. Li, Y. Tian, T. Sonobe, K. Kawarabayashi, S. Jegelka, Representation Learning on Graphs with Jumping...
  • J. Li, Y. Rong, H. Cheng, H. Meng, W. Huang, J. Huang, Semi-supervised graph classification: A hierarchical graph...
  • Z. Hao, C. Lu, Z. Huang, H. Wang, Z. Hu, Q. Liu, E. Chen, C. Lee, ASGN: An active semi-supervised graph neural network...
  • W. Ju, J. Yang, M. Qu, W. Song, J. Shen, M. Zhang, KGNN: Harnessing Kernel-based Networks for Semi-supervised Graph...
  • A. Jaiswal et al., A survey on contrastive self-supervised learning, Technologies (2021)
  • R.D. Hjelm, A. Fedorov, S. Lavoie-Marchildon, K. Grewal, P. Bachman, A. Trischler, Y. Bengio, Learning deep...