STsCache: An Efficient Semantic Caching Scheme for Time-series Data Workloads Based on Hybrid Storage

Authors:

Tao Kong, Hui Li, Yuxuan Zhao, Liping Li, Xiyue Gao, Qilong Wu, Jiangtao Cui

Abstract

The growing demand for extreme-scale time-series workloads in data centers calls for a high-performance semantic caching system that leverages the semantics and results of historical queries to answer new time-series queries. Existing caching solutions either ignore query semantics and therefore deliver suboptimal performance, or target only specific scenarios and therefore offer small capacity and limited functionality. In this paper, we summarize the query patterns of time-series workloads and, for the first time, propose a definition of semantic time-series caching. Accordingly, we present STsCache, a semantic time-series caching system based on a hybrid storage model that combines memory and NVMe SSD. We propose a series of optimization strategies: slab-based semantic data management, a semantic index, semantic value-driven batch eviction, time-aware deduplication insertion, and lazy compaction. We implemented STsCache and evaluated it with benchmarks and in production environments. STsCache increases the throughput of popular time-series databases (InfluxDB, TimescaleDB) by 4.8-10.8× and reduces their latency by 79.9%-93.5%. Compared with the latest time-series caching schemes (TSCache, BSCache), STsCache increases throughput by 1.5-4.5×, reduces latency by 59.4%-81.9%, and increases hit ratios by 22.5%-82.4%.
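
The abstract does not describe STsCache's internals, so the following is only a toy sketch of the general idea behind semantic caching for time-range queries: reuse the cached results of earlier queries for the overlapping part of a new query, and fetch only the missing time ranges from the database. The names `RangeCache`, `backend`, and `backend_calls` are hypothetical, and the sketch does not model the paper's slab-based management, semantic index, eviction, deduplication, or compaction.

```python
class RangeCache:
    """Toy semantic cache for a single series (illustrative, not STsCache).

    Keeps the results of past time-range queries as non-overlapping
    segments and answers a new query from them where possible.
    All ranges are half-open: [start, end).
    """

    def __init__(self, backend):
        # backend is a hypothetical callable: (start, end) -> list of (ts, value)
        self.backend = backend
        self.segments = []       # sorted list of (start, end, points)
        self.backend_calls = 0   # counts round trips to the database

    def _fetch(self, start, end):
        self.backend_calls += 1
        return self.backend(start, end)

    def query(self, start, end):
        points, new_segments = [], []
        cursor = start           # everything before cursor is already answered
        for seg_start, seg_end, seg_points in self.segments:
            if seg_end <= cursor or seg_start >= end:
                continue         # segment does not overlap the remaining range
            if seg_start > cursor:
                # Uncached gap before this segment: fetch only the gap.
                gap = self._fetch(cursor, seg_start)
                points.extend(gap)
                new_segments.append((cursor, seg_start, gap))
            lo, hi = max(cursor, seg_start), min(end, seg_end)
            # Reuse the overlapping part of the cached segment.
            points.extend(p for p in seg_points if lo <= p[0] < hi)
            cursor = hi
        if cursor < end:
            # Uncached tail after the last overlapping segment.
            gap = self._fetch(cursor, end)
            points.extend(gap)
            new_segments.append((cursor, end, gap))
        # Gaps never overlap existing segments, so the list stays disjoint.
        self.segments = sorted(self.segments + new_segments, key=lambda s: s[0])
        return points


# Example: a fake database holding one point per timestamp 0..99.
data = [(t, t * 10) for t in range(100)]
cache = RangeCache(lambda s, e: [p for p in data if s <= p[0] < e])

cache.query(10, 20)   # miss: one backend call for [10, 20)
cache.query(15, 25)   # partial hit: only [20, 25) goes to the backend
cache.query(12, 22)   # full hit: served entirely from cached segments
```

In this sketch the third query costs no backend round trip because both cached segments together cover its range. A key-value cache keyed on the exact query string would have missed all three times.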

PVLDB is part of the VLDB Endowment Inc.

Volume 18, No. 9
