Volume 18, No. 9
Effective and Efficient Distributed Temporal Graph Learning through Hotspot Memory Sharing
Abstract
Memory-based temporal graph neural network (MTGNN) models are effective for prediction on temporal graphs, using node memory and message-passing modules to capture temporal and structural information, respectively. However, distributed training on large graphs faces challenges such as accuracy loss and reduced efficiency caused by remote feature and memory transmission. Despite advances in MTGNN system optimizations, issues such as dynamic load imbalance, communication overhead, and memory staleness persist. To tackle these challenges, we introduce MemShare, a distributed MTGNN system. MemShare proposes a novel shared node memory paradigm that replicates a small subset of shared nodes across machines and GPUs to reduce distributed communication for memory management. It incorporates techniques including shared-node-centric graph partitioning, shared-node-aware boundary decay sampling, and shared-node-targeted synchronous smoothing aggregation. Experiments show that MemShare outperforms existing distributed MTGNN systems in both accuracy and training efficiency.
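The shared node memory idea in the abstract can be illustrated with a minimal sketch: a small set of "hotspot" nodes is replicated on every worker, so memory reads for those nodes never require remote transmission, while every other node's memory lives on exactly one owner worker. All class and variable names below are hypothetical, chosen for illustration; they are not MemShare's actual API.

```python
import numpy as np

class HotspotMemory:
    """Toy model of shared node memory: hotspot nodes are replicated on
    every worker (reads are always local); other nodes have one owner,
    so reads from any other worker count as remote transmissions."""

    def __init__(self, num_nodes, dim, shared_nodes, worker_of):
        self.shared = set(shared_nodes)        # replicated on all workers
        self.worker_of = worker_of             # owner of each non-shared node
        self.mem = np.zeros((num_nodes, dim))  # per-node memory vectors

    def is_local(self, node, worker):
        # Local if the node is a shared hotspot or owned by this worker.
        return node in self.shared or self.worker_of[node] == worker

    def read(self, node, worker):
        # Return the memory vector and whether a remote fetch was needed.
        remote = not self.is_local(node, worker)
        return self.mem[node], remote

# Usage: 2 workers, 6 nodes; node 0 is a hotspot shared by both workers.
worker_of = {1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
store = HotspotMemory(6, 4, shared_nodes=[0], worker_of=worker_of)
_, remote_hot = store.read(0, worker=1)   # hotspot: local on every worker
_, remote_own = store.read(1, worker=1)   # owned by worker 0: remote here
```

The trade-off this sketch captures is that replicating hotspots costs some extra memory and synchronization for those few nodes, but eliminates the bulk of cross-machine memory traffic when access frequency is skewed toward them.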