SMGCL: Semi-supervised Multi-view Graph Contrastive Learning
Introduction
Contrastive learning (CL) is a self-supervised representation learning technique that learns general data features by pulling positive data pairs together and pushing negative data pairs apart in the embedding space [1]. CL is used extensively in a variety of practical scenarios, such as computer vision [2], [3] and natural language processing [4], [5], particularly when unlabeled data are abundant but labeled data are scarce. Recently, the success of CL on vision and language tasks has greatly inspired its application to graph data, yielding a range of graph contrastive learning (GCL) methods.
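The pull-together/push-apart objective is commonly instantiated as an InfoNCE-style loss. The sketch below is a minimal NumPy illustration of that generic idea, not the loss of any particular method cited here; the function name and the `temperature` value are hypothetical choices for the example.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.5):
    """InfoNCE: row i of `positives` is the positive for row i of
    `anchors`; every other row implicitly serves as a negative."""
    # L2-normalize so dot products become cosine similarities
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))          # positive pairs on the diagonal
```

Minimizing this quantity raises the diagonal (positive-pair) similarities relative to all other pairs, which is exactly the pulling/pushing behavior described above.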
GCL can generate supervision information by transforming the graph data itself, even without labels, which greatly alleviates the problems arising from scarce labeled data. Graph neural networks (GNNs) [6], a popular family of deep learning algorithms operating on graph data, have shown impressive capabilities in graph representation learning. Existing GCL methods generally employ GNNs as encoders and leverage graph augmentation approaches to produce data pairs for contrastive learning. Data pairs come in the form of (node, node), (node, graph), (graph, subgraph), (graph, graph), and so on. Graph representations can be obtained by maximizing the mutual information (MI) between the latent features of the generated data pairs. Most GCL methods [7], [8], [9] adopt a completely unsupervised scheme for contrastive learning. Although these methods can achieve superior performance by generating supervision information from unlabeled data, they do not effectively utilize the scarce yet valuable labeled data, resulting in a lack of discriminative power in the learned representations [10]. The method in [10] further takes into account the consistency between label-oriented data pairs and accomplishes semi-supervised contrastive learning based on the original graph topology. However, it ignores the fact that multi-view augmentation of graphs can further strengthen the robustness of a contrastive learning model [11].
We summarize representative GCL methods in Table 1. As Table 1 shows, there is a dearth of work that supports both semi-supervised graph contrastive learning and multi-view graph augmentation. The primary challenge is how to properly utilize the limited label information throughout the GCL process while balancing the interactions across multi-view information. This direction deserves special attention and is the focus of this paper.
In this work, we propose a framework for extracting sufficient contrastive supervision information and learning discriminative representations of graph data, called Semi-supervised Multi-view Graph Contrastive Learning (SMGCL). The proposed framework conducts semi-supervised contrastive learning from two perspectives. The first is the design of a semi-supervised contrastive learning loss to promote intra-class compactness and inter-class separability. Specifically, it involves an unsupervised loss for maximizing the MI between representations of nodes and their subgraphs, and a supervised loss for enhancing the agreement between representations of node pairs with the same label. The second is to fine-tune the contrastive learning procedure with a semi-supervised classification module. This module combines a GNN encoder with a fully connected (FC) layer to predict node (or subgraph) labels, and employs a label augmentation strategy to further exploit the predictions on unlabeled nodes. Furthermore, to capture the underlying determinative relationship between the learned representations and the input graph topology, we add a decoder module shared across views that reconstructs the original graph from the latent representations, serving as a complementary supervision signal for graph contrastive learning. To verify the effectiveness of the proposed SMGCL framework, we carry out comprehensive experiments on several real-world datasets. Experimental results on node classification and graph classification tasks demonstrate that the proposed framework outperforms state-of-the-art comparison methods. The main contributions of this work are summarized below:
- We propose a semi-supervised multi-view graph contrastive learning framework. The framework can integrate information from both topology and feature views and extract supervision information in a semi-supervised manner.
- We build a semi-supervised contrastive learning module with graph reconstruction to integrate and balance multiple sources of graph information. In addition, we introduce a semi-supervised GNN with a label augmentation module to enhance the discriminative power of the learned graph representations.
- We evaluate the proposed framework on node classification and graph classification tasks on several real-world datasets and demonstrate its superiority over state-of-the-art comparison methods.
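To make the supervised contrastive term and the graph reconstruction term concrete, the sketch below gives a minimal NumPy illustration. It is not the exact SMGCL objective (which is not fully specified in this excerpt); the function names, the `temperature`, and the combination weights `alpha`/`beta` are hypothetical, and the unsupervised node–subgraph MI term is omitted.

```python
import numpy as np

def sup_contrastive_loss(z, labels, temperature=0.5):
    """SupCon-style term: pull together embeddings of nodes sharing a
    label, push apart the rest (details here are illustrative)."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n, total, count = len(labels), 0.0, 0
    for i in range(n):
        pos = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not pos:
            continue  # unlabeled / singleton nodes contribute nothing
        others = [j for j in range(n) if j != i]
        log_denom = np.log(np.exp(sim[i, others]).sum())
        total += -np.mean(sim[i, pos]) + log_denom
        count += 1
    return total / max(count, 1)

def reconstruction_loss(z, adj):
    """Inner-product decoder: binary cross-entropy between the
    reconstructed edge probabilities sigmoid(z z^T) and the adjacency."""
    probs = 1.0 / (1.0 + np.exp(-(z @ z.T)))
    eps = 1e-9
    return -np.mean(adj * np.log(probs + eps)
                    + (1 - adj) * np.log(1 - probs + eps))

def total_loss(z, labels, adj, alpha=1.0, beta=0.1):
    # Hypothetical weighted combination of the two supervision signals
    return alpha * sup_contrastive_loss(z, labels) + beta * reconstruction_loss(z, adj)
```

The reconstruction term acts as the complementary supervision signal described above: embeddings are penalized when their inner products fail to reproduce the observed topology.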
Graph Neural Networks
Graph Neural Networks (GNNs) are a family of deep neural networks that model graph data in non-Euclidean space through a recursive neighborhood aggregation scheme [13]. In recent years, researchers have conducted extensive work on GNNs, which have significantly improved the performance of large-scale data-driven tasks, including node classification, graph classification, community detection, etc. Bruna et al. [14] presented the first prominent study on the
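The recursive neighborhood aggregation scheme can be sketched as a single graph convolution layer. This is a generic GCN-style illustration in NumPy (following the familiar symmetric normalization of Kipf and Welling), not the encoder of any particular method surveyed here.

```python
import numpy as np

def gcn_layer(adj, x, w):
    """One round of neighborhood aggregation: add self-loops,
    symmetrically normalize the adjacency, average neighbor features,
    then apply a linear projection and ReLU."""
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                        # A + I (self-loops)
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # H = ReLU(D^{-1/2} (A + I) D^{-1/2} X W)
    return np.maximum(a_norm @ x @ w, 0.0)
```

Stacking k such layers lets each node aggregate information from its k-hop neighborhood, which is the sense in which the aggregation is recursive.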
The proposed method
In this section, we first formulate the preliminaries about the semi-supervised learning on graphs and then introduce the framework of the proposed SMGCL. For clarity, we summarize the frequently used notations in Table 2.
Experiments
Experiments are performed on several real-world datasets to evaluate the following three aspects:
- (1) Classification ability: evaluate the discriminative power of the learned representations on node and graph classification tasks.
- (2) Effect of the core components: several ablation studies are conducted to show the contribution of the core components (supervised contrastive learning, graph decoding, and consistency loss) to the performance of the entire model.
- (3) Sensitivity of the key parameters:
Conclusion and future work
In this paper, we proposed SMGCL, a semi-supervised multi-view graph contrastive learning framework. The framework allows for the incorporation of multi-view graph information for contrastive learning and the extraction of supervision information in a semi-supervised manner. A semi-supervised contrastive learning loss is designed to promote intra-class compactness and inter-class separability, which facilitates the full utilization of labeled and unlabeled data to achieve excellent classification
CRediT authorship contribution statement
Hui Zhou: Conceptualization, Methodology, Software, Writing – original draft, Writing – review & editing. Maoguo Gong: Validation, Resources, Supervision. Shanfeng Wang: Methodology, Visualization, Investigation. Yuan Gao: Data curation, Writing – review & editing. Zhongying Zhao: Writing – review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant no. 62036006 and no. 62072288), and the Taishan Scholar Program of Shandong Province.
References (58)
- et al., GHNN: Graph harmonic neural networks for semi-supervised graph-level classification, Neural Netw. (2022)
- et al., Bi-CLKT: Bi-graph contrastive learning based knowledge tracing, Knowl.-Based Syst. (2022)
- et al., Robust cross-network node classification via constrained graph mutual information, Knowl.-Based Syst. (2022)
- et al., Interpretable and efficient heterogeneous graph convolutional network, IEEE Trans. Knowl. Data Eng. (2021)
- et al., Supervised contrastive learning, Adv. Neural Inf. Process. Syst. (2020)
- T. Chen, S. Kornblith, M. Norouzi, G.E. Hinton, A Simple Framework for Contrastive Learning of Visual Representations, ...
- K. He, H. Fan, Y. Wu, S. Xie, R.B. Girshick, Momentum Contrast for Unsupervised Visual Representation Learning, in: ...
- J.M. Giorgi, O. Nitski, B. Wang, G.D. Bader, DeCLUTR: Deep Contrastive Learning for Unsupervised Textual ...
- et al., SimCSE: Simple contrastive learning of sentence embeddings (2021)
- et al., Self-paced co-training of graph neural networks for semi-supervised node classification, IEEE Trans. Neural Netw. Learn. Syst. (2022)
- Contrastive multi-view representation learning on graphs
- Deep graph contrastive representation learning
- Deep learning on graphs: A survey, IEEE Trans. Knowl. Data Eng.
- A survey on contrastive self-supervised learning, Technologies