Volume 18, No. 11

Not Small Enough? SegPQ: A Learned Approach to Compress Product Quantization Codebooks

Authors:
Qiyu Liu, Yanlin Qi, Siyuan Han, Jingshu Peng, Jin Li, Lei Chen

Abstract

The rapid advancement of generative artificial intelligence (GenAI) has recently led to renewed attention to approximate nearest neighbor (ANN) search and vector databases (VectorDB). Among various ANN methodologies, vector quantization techniques such as product quantization (PQ) are widely used to generate space-efficient representations of large-scale dense vectors. However, the codebooks generated by PQ often reach several gigabytes in size, making them impractical for web-scale, high-dimensional vectors in resource-constrained environments such as mobile devices. In this study, we propose SegPQ, a simple yet effective framework for losslessly compressing codebooks generated by any PQ variant, enabling efficient in-memory vector search on devices with limited memory. SegPQ represents the raw PQ codewords as a trained error-bounded piecewise linear approximation model (ε-PLA) together with pre-computed low-bit residuals. We theoretically demonstrate that, with high probability, the number of bits per compressed codeword is 1.721 + ⌈log₂ ε_OPT⌉, where ε_OPT is the optimal error parameter, which can be determined from data characteristics. To accelerate query execution, we further design SIMD-aware query processing algorithms on compressed codebooks that fully exploit the hardware parallelism offered by modern architectures. Extensive experiments on real datasets show that, for 1 billion vectors, SegPQ reduces PQ codebook memory consumption by up to 4.7× (approx. 851 MB) while incurring only 3.3% additional query processing overhead from decompression.
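To make the "ε-PLA plus low-bit residuals" idea concrete, the following is a minimal sketch of the general technique the abstract describes, not the authors' implementation: a greedy error-bounded piecewise linear fit (the classic "shrinking slope cone" method) over a sequence of integer values, storing per-segment line parameters plus per-value residuals. Because every residual is bounded by ε (plus rounding), each one fits in roughly ⌈log₂ ε⌉ + O(1) bits, which is the shape of the bound stated above. All function and variable names here are illustrative.

```python
import math

def pla_compress(values, eps):
    """Greedy error-bounded PLA: a segment anchored at (x0, y0) is extended
    while some single slope keeps every covered point within +/- eps of the
    line; the feasible slope interval only shrinks as points are added."""
    segments = []   # (start_index, intercept, slope)
    residuals = []  # one small integer per value; |r| <= eps + 0.5
    i, n = 0, len(values)
    while i < n:
        x0, y0 = i, values[i]
        lo, hi = -math.inf, math.inf  # feasible slope interval
        j = i + 1
        while j < n:
            dx = j - x0
            new_lo = max(lo, (values[j] - y0 - eps) / dx)
            new_hi = min(hi, (values[j] - y0 + eps) / dx)
            if new_lo > new_hi:        # no slope satisfies all points
                break
            lo, hi = new_lo, new_hi
            j += 1
        slope = 0.0 if lo == -math.inf else (lo + hi) / 2
        segments.append((x0, y0, slope))
        for k in range(i, j):
            pred = round(y0 + slope * (k - x0))
            residuals.append(values[k] - pred)  # bounded by eps + rounding
        i = j
    return segments, residuals

def pla_decompress(segments, residuals, n):
    """Lossless reconstruction: model prediction + stored residual."""
    out = [0] * n
    for s, (x0, y0, slope) in enumerate(segments):
        end = segments[s + 1][0] if s + 1 < len(segments) else n
        for k in range(x0, end):
            out[k] = round(y0 + slope * (k - x0)) + residuals[k]
    return out
```

A near-linear sequence with bounded noise compresses into very few segments, and decompression recovers the input exactly; a real system would additionally bit-pack the residuals and, as in the paper, vectorize decompression with SIMD.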
