go back

Volume 18, No. 4

GTI: Graph-based Tree Index with Logarithm Updates for Nearest Neighbor Search in High-Dimensional Spaces

Authors:
Ruiyao Ma, Yifan Zhu, Baihua Zheng, Lu Chen, Congcong Ge, Yunjun Gao

Abstract

Nearest neighbor search (NNS) is fundamental for high-dimensional space retrieval and impacts various fields, such as pattern recognition, information retrieval, recommendation systems, and vector database management. Among existing NNS methods, graph-based methods often excel in query accuracy and efficiency. However, these methods face significant challenges, including high construction costs and difficulties with dynamic data updates. Recent efforts have focused on combining graph methods with hashing, quantization, and tree-based approaches to address these issues, but problems with large index sizes and update performance remain unresolved. In response, this paper proposes GTI , a novel, lightweight, and dynamic graph-based tree index for high-dimensional NNS. GTI constructs a tree index built across the entire dataset and employs a lightweight graph index at the level 1 of the tree to significantly reduce graph construction costs. It also features effective data insertion and deletion algorithms that enable logarithmic realtime updates. Additionally, we have developed an effective NNS algorithm for GTI , which not only achieves approximate search performance on par with SOTA graph-based methods but also supports exact NNS. Extensive experiments on six real-world datasets demonstrate that GTI achieves an approximately 10 × improvement in update efficiency compared to SOTA tree-based methods, while achieving search effectiveness comparable to SOTA approximate NNS methods. These results underscore the potential of GTI for effective application in dynamic and evolving scenarios.

PVLDB is part of the VLDB Endowment Inc.

Privacy Policy