Volume 18, No. 6

NeutronTask: Scalable and Efficient Multi-GPU GNN Training with Task Parallelism

Authors:
Zhenbo Fu, Xin Ai, Qiange Wang, Yanfeng Zhang, Shizhan Lu, Chaoyi Chen, Chunyu Cao, Hao Yuan, Zhewei Wei, Yu Gu, Yingyou Wen, Ge Yu

Abstract

Graph neural networks (GNNs) have emerged as a promising method for learning from graph data, but large-scale GNN training demands extensive memory and computation resources. To address this, researchers have proposed multi-GPU processing, which partitions graph data across GPUs for parallel training. However, vertex dependencies in multi-GPU GNN training lead to significant neighbor replication across GPUs, increasing memory consumption, and the substantial intermediate data generated during training further exacerbates this issue. Together, neighbor replication and intermediate data constitute the primary memory consumption in GNN training, typically accounting for over 80%. In this work, we propose GNN task parallelism for multi-GPU GNN training, which reduces neighbor replication by partitioning the training tasks in each layer across GPUs rather than partitioning the graph structure. This approach partitions the graph data only within individual GPUs, reducing the memory requirement of each task while overlapping subgraph computation across GPUs. Neighbor embeddings shared among different subgraphs can then be reused efficiently within a single GPU. Additionally, we employ a task-decoupled GNN training framework that decouples the training tasks so that their associated intermediate data can be managed independently and released as early as possible, reducing memory usage. Integrating these techniques, we build NeutronTask, a multi-GPU GNN training system. Experimental results on a 4×A5000 GPU server show that NeutronTask effectively supports billion-scale full-graph GNN training. For smaller graphs whose training data fits in GPU memory, NeutronTask achieves a 1.27×-5.47× speedup over state-of-the-art GNN systems, including NeutronStar and Sancus.
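
To make the abstract's two ideas concrete, the following is a minimal, hedged sketch in PyTorch of (1) assigning per-layer GNN tasks to different GPUs rather than partitioning the graph across GPUs, and (2) releasing a task's intermediate data as soon as the next task has consumed it. All names (LayerTask, run_pipeline) are hypothetical illustrations, not NeutronTask's actual design or API.

    import torch

    class LayerTask(torch.nn.Module):
        """One layer's aggregate-and-update task, pinned to a single device."""
        def __init__(self, in_dim, out_dim, device):
            super().__init__()
            self.device = device
            self.linear = torch.nn.Linear(in_dim, out_dim).to(device)

        def forward(self, adj, h):
            # The whole (sub)graph for this task lives on one GPU, so
            # neighbor embeddings shared among subgraphs are reused locally
            # instead of being replicated across devices.
            adj = adj.to(self.device)
            h = h.to(self.device)
            agg = torch.sparse.mm(adj, h)        # neighbor aggregation
            return torch.relu(self.linear(agg))  # per-vertex neural update

    def run_pipeline(tasks, adj, feats):
        """Chain per-layer tasks across devices. Forward-only here, so each
        intermediate can be dropped once the next task has consumed it;
        handling backward-pass dependencies is what the paper's task
        decoupling addresses and is not modeled in this sketch."""
        h = feats
        for task in tasks:
            nxt = task(adj, h)
            del h      # release the previous layer's output early
            h = nxt
        return h

    if __name__ == "__main__":
        n, d = 1024, 64
        adj = torch.eye(n).to_sparse()  # toy graph: self-loops only
        feats = torch.randn(n, d)
        devs = [f"cuda:{i}" for i in range(torch.cuda.device_count())] or ["cpu"]
        tasks = [LayerTask(d, d, devs[i % len(devs)]) for i in range(2)]
        with torch.no_grad():
            print(run_pipeline(tasks, adj, feats).shape)

The sketch only captures the memory intuition: each task keeps its subgraph on one device, activations are handed between devices once, and an intermediate is freed the moment it is no longer needed. How NeutronTask actually schedules, overlaps, and decouples tasks during training is detailed in the paper itself.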
