Volume 18, No. 12

Workload Insights From the Snowflake Data Cloud: What Do Production Analytic Queries Really Look Like?

Authors:
Jan Vincent Szlang, Sebastian Breß, Sebastian Cattes, Jonathan Dees, Florian Funke, Max Heimel, Michel Oleynik, Ismail Oukid, Tobias Maltenberger

Abstract

Capturing the characteristics of real-world analytical workloads is challenging yet critical for advancing industry practices and academic research. Historically, obtaining accurate query and data characteristics has been difficult, largely because detailed workload information has often been confined to on-premises database systems. With the rise of cloud-native databases like Snowflake, it has become possible to analyze production query workloads at scale and in greater detail. Leveraging this capability, this study presents a comprehensive analysis of analytics workloads across diverse customers and industries. In particular, we investigate the query characteristics of 667 million queries issued by the most popular BI tools against Snowflake over a two-week period. Based on this dataset, this paper makes two primary contributions. First, we conduct a detailed examination of query properties, with particular attention to filters, joins, aggregations, and other previously underexplored aspects. Second, we uncover unique and practically relevant query patterns that are typically absent from standard database benchmarks.

PVLDB is part of the VLDB Endowment Inc.
