Volume 18, No. 5

Evaluating Continuous Queries with Inconsistency Annotations

Authors:
Samuele Langhi, Angela Bonifati, Riccardo Tommasini

Abstract

Continuous Queries (CQs) run indefinitely, processing infinite data streams and producing continuous outputs. They commonly use window functions to segment streams into finite chunks for computation. Ensuring data integrity in CQs is challenging, involving, for example, streaming joins for binary constraints. Current methods, like dropping or repairing inconsistent data, can harm throughput and increase latency. This paper proposes a novel approach using provenance-based techniques to map violations in input streams to CQ results with minimal overhead. This ensures continuous data flow and maintains the analytical integrity of CQs. Our study explores the feasibility and efficiency of this method, addressing a significant gap in applying provenance techniques to streaming data. While provenance-based techniques have proven effective for static data, their application in streaming contexts remains unexplored. Our solution addresses this gap, achieving a stable throughput across increasingly demanding memory loads with respect to the baselines, ranging from a 10% increase for medium-sized buffers (i.e., the windows) up to 80% for heavier loads. Moreover, results show the minimal impact of annotation (up to 25%) on the total execution runtime, demonstrating the effectiveness of our graph-based approach.

PVLDB is part of the VLDB Endowment Inc.
