Holistic query Approximation via RL Modeling

Authors:

Susan Davidson, Tova Milo, Kathy Razmadze, Gal Zeevi

Download PDF

Abstract

In data exploration, executing queries over a large database can be time-consuming. Previous work has proposed approximate query processing as a way to speed up aggregate queries in this context, but do not address non-aggregate queries. Our paper introduces a novel holistic approach to handle both types of queries by finding an optimized subset of data, referred to as an approximation set . The goal is to maximize query result quality while using a smaller set of data, thereby significantly reducing the query execution time. We formalize this problem as Holistic Approximate Query Processing and establish its NP-completeness. To tackle this, we propose an approximate solution using Reinforcement Learning , termed HARLM . While HARLM does not provide theoretical guarantees due to its reliance on Reinforcement Learning, it effectively overcomes challenges related to the large action space and the need for generalization beyond a known query workload. Experimental results on both non-aggregate and aggregate benchmarks show that HARLM significantly outperforms the baselines both in terms of accuracy (30% improvement) and efficiency (10-35X).

PVLDB is part of the VLDB Endowment Inc.

Start

Current Submission

All Volumes

Reproducibility

General Information

Volume 18, No. 6

Holistic query Approximation via RL Modeling

Abstract