go back

Volume 18, No. 4

Efficient Graph Embedding Generation and Update for Large-Scale Temporal Graph

Authors:
Yifan Song, Xiaolong Chen, Wenqing Lin, Jia Li, Chen Zhang, Yan Zhou, Lei Chen, Jing Tang

Abstract

Graph embedding aims at mapping each node to a low-dimensional vector, beneficial for various applications like pattern matching, retrieval augmented generation and recommendation. In this paper, we study the large-scale temporal graph embedding problem. Different from simple graphs, each edge has a timestamp in temporal graphs, which requires the embeddings to encode the temporal biases. Factorizing similarity matrix is a common approach for generating simple graph embeddings where similarity can be well characterized by some conventional metrics like personalized PageRank. However, how to construct a similarity that can encode interactions with temporal biases is a critical problem for large scale temporal graphs. To address this, we introduce the concept of temporal-based bipartite graph (TBG) and develop the temporal preferential attachment similarity (TPASim) that reflects concurrent node activity over time. Directly factorizing the TPASim matrix, which contains nearly n2 non-zeros, is not feasible for large graphs with n nodes. Instead, we present LTGE , which constructs and factorizes a temporal matrix with at most 2m non-zeros, where m is the number of edges. Our theoretical analysis shows that LTGE achieves the same embeddings as factorizing the TPASim matrix but significantly reduces complexity by a factor of n2/m. On the other hand, when graphs evolve over time, to avoid recomputing, we further propose LTGEInc that utilizes a novel incremental singular value decomposition (SVD) algorithm with provable guarantee for updating the embeddings. Extensive experiments on several datasets with up to 17 million nodes and 1.3 billion edges demonstrate that LTGE outperforms the state of the art significantly and is orders of magnitude faster than the baselines specially designed for temporal graphs. For embeddings update, LTGEInc retains the performance with small computational overhead.

PVLDB is part of the VLDB Endowment Inc.

Privacy Policy