@inproceedings{DBLP:conf/vldb/AgarwalADGNRS96,
author = {Sameet Agarwal and
Rakesh Agrawal and
Prasad Deshpande and
Ashish Gupta and
Jeffrey F. Naughton and
Raghu Ramakrishnan and
Sunita Sarawagi},
editor = {T. M. Vijayaraman and
Alejandro P. Buchmann and
C. Mohan and
Nandlal L. Sarda},
title = {On the Computation of Multidimensional Aggregates},
booktitle = {VLDB'96, Proceedings of 22th International Conference on Very
Large Data Bases, September 3-6, 1996, Mumbai (Bombay), India},
publisher = {Morgan Kaufmann},
year = {1996},
isbn = {1-55860-382-4},
pages = {506-521},
ee = {db/conf/vldb/AgarwalADGNRS96.html},
crossref = {DBLP:conf/vldb/96},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
At the heart of all OLAP or multidimensional data analysis is the ability to simultaneously aggregate across many sets of dimensions. Computing multidimensional aggregates is a performance bottleneck for these applications. This paper presents fast algorithms for computing a collection of group-bys. We focus on a special case of the aggregation problem: computation of the CUBE operator. The CUBE operator requires computing group-bys on all possible combinations of a list of attributes, and is equivalent to the union of a number of standard group-by operations. We show how the structure of CUBE computations can be viewed in terms of a hierarchy of group-by operations. Our algorithms extend sort-based and hash-based grouping methods with several optimizations, such as combining common operations across multiple group-bys, caching, and using pre-computed group-bys for computing other group-bys. Empirical evaluation shows that the resulting algorithms perform much better than straightforward methods.
This paper combines work done concurrently on computing the data cube by two different teams as reported in [SAG96] and [DANR96].
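To make the abstract's description concrete, the following minimal Python sketch treats CUBE as the union of group-bys over every subset of an attribute list, and contrasts a straightforward computation with one that derives each group-by from a previously computed one. It is an illustration only, not the paper's algorithms: the function names (`group_by`, `naive_cube`, `rollup`, `hierarchical_cube`), the dictionary-based hash grouping, the use of SUM as the aggregate, and the "pick the smallest parent" rule are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def group_by(rows, dims, measure):
    """Hash-based grouping: sum `measure` over rows grouped by `dims`."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dims)
        totals[key] += row[measure]
    return dict(totals)


def naive_cube(rows, attrs, measure):
    """Straightforward CUBE: compute all 2^n group-bys from the base rows."""
    return {
        dims: group_by(rows, dims, measure)
        for k in range(len(attrs), -1, -1)
        for dims in combinations(attrs, k)
    }


def rollup(parent_dims, parent_result, child_dims):
    """Derive a coarser group-by from an already computed finer one.

    Valid here because SUM can be computed from partial sums.
    """
    positions = [parent_dims.index(d) for d in child_dims]
    totals = defaultdict(int)
    for key, value in parent_result.items():
        totals[tuple(key[p] for p in positions)] += value
    return dict(totals)


def hierarchical_cube(rows, attrs, measure):
    """CUBE that reads the base rows once, then reuses computed group-bys."""
    attrs = tuple(attrs)
    cube = {attrs: group_by(rows, attrs, measure)}
    for k in range(len(attrs) - 1, -1, -1):
        for dims in combinations(attrs, k):
            # Candidate parents: computed group-bys with one more attribute.
            parents = [
                p for p in cube
                if len(p) == k + 1 and set(dims) <= set(p)
            ]
            # Assumed heuristic: the parent with the fewest groups is cheapest.
            parent = min(parents, key=lambda p: len(cube[p]))
            cube[dims] = rollup(parent, cube[parent], dims)
    return cube
```

Both functions return a dictionary keyed by the tuple of grouping attributes, with the empty tuple `()` holding the grand total. The second version shows, in a very simplified form, why the hierarchy of group-bys matters: only the finest group-by scans the base data, and every other one is computed from a smaller intermediate result.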
Copyright © 1996 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.