Volume 18, No. 9
Effective and Efficient Distributed Temporal Graph Learning through Hotspot Memory Sharing
Abstract
Memory-based temporal graph neural network (MTGNN) models are effective for prediction on temporal graphs, using node memory and message-passing modules to capture temporal and structural information, respectively. However, distributed training on large graphs faces challenges such as accuracy loss and reduced efficiency caused by remote feature and memory transmission. Despite advances in MTGNN system optimizations, issues such as dynamic load imbalance, communication overhead, and memory staleness persist. To tackle these challenges, we introduce MemShare, a distributed MTGNN system. MemShare proposes a novel shared node memory paradigm that replicates a small subset of shared nodes across machines and GPUs to reduce distributed communication for memory management. It incorporates techniques including shared-node-centric graph partitioning, shared-node-aware boundary decay sampling, and shared-node-targeted synchronous smoothing aggregation. Experiments show that MemShare outperforms existing distributed MTGNN systems in both accuracy and training efficiency.
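The shared node memory idea in the abstract can be illustrated with a minimal sketch: a small set of "hotspot" nodes is replicated on every worker, so memory reads for those nodes never require remote transmission, while every other node's memory lives on exactly one owner worker. All class and variable names below are hypothetical, chosen for illustration; they are not MemShare's actual API.

```python
import numpy as np

class HotspotMemory:
    """Toy model of shared node memory: hotspot nodes are replicated on
    every worker (reads are always local); other nodes have one owner,
    so reads from any other worker count as remote transmissions."""

    def __init__(self, num_nodes, dim, shared_nodes, worker_of):
        self.shared = set(shared_nodes)        # replicated on all workers
        self.worker_of = worker_of             # owner of each non-shared node
        self.mem = np.zeros((num_nodes, dim))  # per-node memory vectors

    def is_local(self, node, worker):
        # Local if the node is a shared hotspot or owned by this worker.
        return node in self.shared or self.worker_of[node] == worker

    def read(self, node, worker):
        # Return the memory vector and whether a remote fetch was needed.
        remote = not self.is_local(node, worker)
        return self.mem[node], remote

# Usage: 2 workers, 6 nodes; node 0 is a hotspot shared by both workers.
worker_of = {1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
store = HotspotMemory(6, 4, shared_nodes=[0], worker_of=worker_of)
_, remote_hot = store.read(0, worker=1)   # hotspot: local on every worker
_, remote_own = store.read(1, worker=1)   # owned by worker 0: remote here
```

The trade-off this sketch captures is that replicating hotspots costs some extra memory and synchronization for those few nodes, but eliminates the bulk of cross-machine memory traffic when access frequency is skewed toward them.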