go back

Volume 18, No. 4

RGS-Sketch: An Accurate, Invertible, and Mergeable Sketch for Online Super Spreader Detection in High-speed Data Streams

Authors:
Boyu Zhang, He Huang, Yu-E Sun, Guoju Gao

Abstract

Super spreader detection in high-speed data streams is crucial for numerous applications. Although many methods have emerged, existing works can hardly concurrently achieve high memory efficiency, support online detection, enable merging data from different measurement points/periods, and offer invertibility. This makes them unable to satisfy flexible application requirements. This paper proposes RGS-Sketch, a novel sketch designed to address this problem. The core of RGS-Sketch lies in a new mergeable memory sharing design called register group sharing. This design organizes registers into groups as basic memory sharing units, accommodating the high skewness of real-world data streams and offering high memory efficiency. Besides, it enables online detection through the real-time acquisition of a group’s state, which also facilitates invertibility. To enhance detection accuracy further, we propose a limited register update strategy. It blocks small flows from updating registers, thereby reducing memory overhead and estimation noises. Extensive experimental results based on four real-world datasets show that RGS-Sketch significantly outperforms the most accurate baselines in accuracy while maintaining a high throughput. Specifically, it improves the F1 scores by up to 0.643 for measurements at a single point/period and up to 0.472 across multiple points/periods.

PVLDB is part of the VLDB Endowment Inc.

Privacy Policy