Volume 18, No. 11

DobLIX: A Dual-Objective Learned Index for Log-Structured Merge Trees

Authors:
Alireza Heidari, Amirhossein Ahmadi, Wei Zhang

Abstract

In this paper, we introduce DobLIX, a dual-objective learned index (LI) specifically designed for Log-Structured Merge (LSM) tree-based key-value stores. Traditional LIs primarily focus on optimizing index lookups, often overlooking the critical role of data access from storage, which can become a significant performance bottleneck. In LSM-based systems, a considerable portion of the index is stored on disk, making lookups highly dependent on the efficient coordination between in-memory structures and disk-resident data. Poorly optimized access patterns can lead to excessive I/O operations, negatively impacting read latency and overall system performance. DobLIX addresses this by incorporating a second objective, data access optimization, into the LI training process. This dual-objective approach ensures that both index lookup efficiency and data access costs are minimized, leading to significant improvements in read performance while maintaining write efficiency in real-world LSM systems. Additionally, DobLIX features a reinforcement learning agent that dynamically tunes the system parameters, allowing it to adapt to varying workloads in real time. Experimental results using real-world datasets demonstrate that DobLIX reduces indexing overhead and improves throughput by 1.19× to 2.21× compared to state-of-the-art methods within RocksDB, a widely used LSM-based storage engine.

PVLDB is part of the VLDB Endowment Inc.