@inproceedings{DBLP:conf/vldb/AgarwalADGNRS96,
  author    = {Sameet Agarwal and Rakesh Agrawal and Prasad Deshpande and Ashish Gupta and Jeffrey F. Naughton and Raghu Ramakrishnan and Sunita Sarawagi},
  editor    = {T. M. Vijayaraman and Alejandro P. Buchmann and C. Mohan and Nandlal L. Sarda},
  title     = {On the Computation of Multidimensional Aggregates},
  booktitle = {VLDB'96, Proceedings of 22th International Conference on Very Large Data Bases, September 3-6, 1996, Mumbai (Bombay), India},
  publisher = {Morgan Kaufmann},
  year      = {1996},
  isbn      = {1-55860-382-4},
  pages     = {506-521},
  ee        = {db/conf/vldb/AgarwalADGNRS96.html},
  crossref  = {DBLP:conf/vldb/96},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

At the heart of all OLAP or multidimensional
data analysis is the ability to simultaneously
aggregate across many sets of dimensions.
Computing multidimensional aggregates is a performance
bottleneck for these applications.
This paper presents fast algorithms for computing a
collection of group-bys.
We focus on a special case of the aggregation problem:
computation of the **CUBE** operator.
The **CUBE** operator requires computing group-bys on all possible
combinations of a list of attributes, and is equivalent to the union
of a number of standard group-by operations.
We show how the structure of **CUBE** computations
can be viewed in terms of a hierarchy of group-by operations.
Our algorithms extend sort-based and hash-based grouping methods
with several optimizations, such as combining common operations across
multiple group-bys, caching, and using pre-computed group-bys to
compute other group-bys.
Empirical evaluation shows that the resulting algorithms
perform much better than straightforward methods.
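
To make the idea concrete, the following is a minimal Python sketch, not the paper's algorithms. It contrasts a straightforward **CUBE** (one independent pass over the base data per group-by) with a version that walks the hierarchy of group-bys and derives each one from an already computed group-by that has one more attribute. The function names, the choice of `SUM` as the aggregate, the "pick the parent with the fewest groups" rule, and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def group_by(rows, key_idx, measure_idx):
    """Sum the measure column over rows grouped by the columns in key_idx."""
    out = defaultdict(int)
    for row in rows:
        out[tuple(row[i] for i in key_idx)] += row[measure_idx]
    return dict(out)


def naive_cube(rows, n_dims):
    """Straightforward CUBE: scan the base data once per group-by (2**n_dims scans)."""
    return {
        dims: group_by(rows, dims, n_dims)  # measure is assumed to follow the dimensions
        for k in range(n_dims, -1, -1)
        for dims in combinations(range(n_dims), k)
    }


def lattice_cube(rows, n_dims):
    """CUBE via the group-by hierarchy: only the finest group-by reads the base data."""
    all_dims = tuple(range(n_dims))
    cube = {all_dims: group_by(rows, all_dims, n_dims)}
    for k in range(n_dims - 1, -1, -1):
        for dims in combinations(all_dims, k):
            # Candidate parents have exactly one extra attribute (illustrative rule:
            # use the parent with the fewest groups, since it is cheapest to scan).
            parents = [p for p in cube if len(p) == k + 1 and set(dims) <= set(p)]
            parent = min(parents, key=lambda p: len(cube[p]))
            positions = [parent.index(d) for d in dims]
            child = defaultdict(int)
            for key, total in cube[parent].items():
                child[tuple(key[i] for i in positions)] += total
            cube[dims] = dict(child)
    return cube


# Hypothetical sample data: (product, city, year, sales)
rows = [
    ("shoes", "NY", 2023, 10),
    ("shoes", "LA", 2023, 5),
    ("hats", "NY", 2024, 7),
    ("hats", "NY", 2023, 3),
]
naive = naive_cube(rows, 3)
lattice = lattice_cube(rows, 3)
```

Both functions return a dictionary keyed by the tuple of grouped dimension indices, so `lattice[(0,)]` holds sales per product and `lattice[()]` holds the grand total. Reusing a parent this way is valid here because `SUM` can be computed from partial sums; an aggregate without that property would need a different treatment.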

This paper combines work done concurrently on computing the data cube by two different teams as reported in [SAG96] and [DANR96].
