An Efficient Subgraph-Inferring Framework for Large-Scale Heterogeneous Graphs

Authors

  • Wei Zhou National Engineering Research Center for Big Data Technology and System Services Computing Technology and System Lab Cluster and Grid Computing Lab School of Computer Science and Technology Huazhong University of Science and Technology, Wuhan, 430074, China
  • Hong Huang National Engineering Research Center for Big Data Technology and System Services Computing Technology and System Lab Cluster and Grid Computing Lab School of Computer Science and Technology Huazhong University of Science and Technology, Wuhan, 430074, China
  • Ruize Shi National Engineering Research Center for Big Data Technology and System Services Computing Technology and System Lab Cluster and Grid Computing Lab School of Computer Science and Technology Huazhong University of Science and Technology, Wuhan, 430074, China
  • Kehan Yin National Engineering Research Center for Big Data Technology and System Services Computing Technology and System Lab Cluster and Grid Computing Lab School of Computer Science and Technology Huazhong University of Science and Technology, Wuhan, 430074, China
  • Hai Jin National Engineering Research Center for Big Data Technology and System Services Computing Technology and System Lab Cluster and Grid Computing Lab School of Computer Science and Technology Huazhong University of Science and Technology, Wuhan, 430074, China

DOI:

https://doi.org/10.1609/aaai.v38i8.28797

Keywords:

DMKM: Graph Mining, Social Network Analysis & Community, ML: Graph-based Machine Learning

Abstract

Heterogeneous Graph Neural Networks (HGNNs) play a vital role in advancing the field of graph representation learning by addressing the complexities arising from diverse data types and interconnected relationships in real-world scenarios. However, traditional HGNNs face challenges when applied to large-scale graphs due to the necessity of training or inferring on the entire graph. As the size of the heterogeneous graphs increases, the time and memory overhead required by these models escalates rapidly, even reaching unacceptable levels. To address this issue, in this paper, we present a novel framework named (SubInfer), which conducts training and inferring on subgraphs instead of the entire graphs, hence efficiently handling large-scale heterogeneous graphs. The proposed framework comprises three main steps: 1) partitioning the heterogeneous graph from multiple perspectives to preserve various semantic information, 2) completing the subgraphs to improve the convergence speed of subgraph training and the performance of subgraph inference, and 3) training and inferring the HGNN model on distributed clusters to further reduce the time overhead. The framework is applicable to the vast majority of HGNN models. Experiments on five benchmark datasets demonstrate that SubInfer effectively optimizes the training and inference phase, delivering comparable performance to traditional HGNN models while significantly reducing time and memory overhead.

Published

2024-03-24

How to Cite

Zhou, W., Huang, H., Shi, R., Yin, K., & Jin, H. (2024). An Efficient Subgraph-Inferring Framework for Large-Scale Heterogeneous Graphs. Proceedings of the AAAI Conference on Artificial Intelligence, 38(8), 9431-9439. https://doi.org/10.1609/aaai.v38i8.28797

Issue

Section

AAAI Technical Track on Data Mining & Knowledge Management