ACM SIGMOD Anthology VLDB dblp.uni-trier.de

On the Computation of Multidimensional Aggregates.

Sameet Agarwal, Rakesh Agrawal, Prasad Deshpande, Ashish Gupta, Jeffrey F. Naughton, Raghu Ramakrishnan, Sunita Sarawagi: On the Computation of Multidimensional Aggregates. VLDB 1996: 506-521
@inproceedings{DBLP:conf/vldb/AgarwalADGNRS96,
  author    = {Sameet Agarwal and
               Rakesh Agrawal and
               Prasad Deshpande and
               Ashish Gupta and
               Jeffrey F. Naughton and
               Raghu Ramakrishnan and
               Sunita Sarawagi},
  editor    = {T. M. Vijayaraman and
               Alejandro P. Buchmann and
               C. Mohan and
               Nandlal L. Sarda},
  title     = {On the Computation of Multidimensional Aggregates},
  booktitle = {VLDB'96, Proceedings of 22th International Conference on Very
               Large Data Bases, September 3-6, 1996, Mumbai (Bombay), India},
  publisher = {Morgan Kaufmann},
  year      = {1996},
  isbn      = {1-55860-382-4},
  pages     = {506-521},
  ee        = {db/conf/vldb/AgarwalADGNRS96.html},
  crossref  = {DBLP:conf/vldb/96},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

At the heart of all OLAP or multidimensional data analysis is the ability to simultaneously aggregate across many sets of dimensions. Computing multidimensional aggregates is a performance bottleneck for these applications. This paper presents fast algorithms for computing a collection of group-bys. We focus on a special case of the aggregation problem - computation of the CUBE operator. The CUBE operator requires computing group-bys on all possible combinations of a list of attributes, and is equivalent to the union of a number of standard group-by operations. We show how the structure of CUBE computations can be viewed in terms of a hierarchy of group-by operations. Our algorithms extend sort-based and hash-based grouping methods with several optimizations, like combining common operations across multiple group-bys, caching, and using pre-computed group-bys for computing other group-bys. Empirical evaluation shows that the resulting algorithms give much better performance compared to straightforward methods.

This paper combines work done concurrently on computing the data cube by two different teams as reported in [SAG96] and [DANR96].

Copyright © 1996 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Online Paper

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 1 Issue 5, VLDB '89-'97" and ... DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...

Printed Edition

T. M. Vijayaraman, Alejandro P. Buchmann, C. Mohan, Nandlal L. Sarda (Eds.): VLDB'96, Proceedings of 22th International Conference on Very Large Data Bases, September 3-6, 1996, Mumbai (Bombay), India. Morgan Kaufmann 1996, ISBN 1-55860-382-4
Contents CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML

Electronic Edition

References

[CM89]
Meng Chang Chen, Lawrence McNamee: On the Data Model and Access Method of Summary Data Management. IEEE Trans. Knowl. Data Eng. 1(4): 519-529(1989) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[GBLP96]
Jim Gray, Adam Bosworth, Andrew Layman, Hamid Pirahesh: Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total. ICDE 1996: 152-159 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[GJ79]
M. R. Garey, David S. Johnson: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman 1979, ISBN 0-7167-1044-7
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[GLS94]
Goetz Graefe, Ann Linville, Leonard D. Shapiro: Sort versus Hash Revisited. IEEE Trans. Knowl. Data Eng. 6(6): 934-944(1994) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Gra93]
Goetz Graefe: Query Evaluation Techniques for Large Databases. ACM Comput. Surv. 25(2): 73-170(1993) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[HNSS95]
Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, Lynne Stokes: Sampling-Based Estimation of the Number of Distinct Values of an Attribute. VLDB 1995: 311-322 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[JS96]
...
[PS82]
Christos H. Papadimitriou, Kenneth Steiglitz: Combinatorial Optimization: Algorithms and Complexity. Prentice-Hall 1982, ISBN 0-13-152462-3
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[FELL57]
...
[HRU96]
Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman: Implementing Data Cubes Efficiently. SIGMOD Conference 1996: 205-216 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[GHRU96]
...
[EPST79]
...
[SN95]
Ambuj Shatdal, Jeffrey F. Naughton: Adaptive Parallel Aggregation Algorithms. SIGMOD Conference 1995: 104-114 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[SDNR96]
Amit Shukla, Prasad Deshpande, Jeffrey F. Naughton, Karthikeyan Ramasamy: Storage Estimation for Multidimensional Aggregates in the Presence of Hierarchies. VLDB 1996: 522-531 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[SAG96]
...
[DANR96]
...
[Mic92]
Zbigniew Michalewicz: Statistical and Scientific Databases. Ellis Horwood 1992
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[NC95]
...
[CODD93]
...
[FINK]
...
[Sho82]
Arie Shoshani: Statistical Databases: Characteristics, Problems, and some Solutions. VLDB 1982: 208-222 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[SR96]
...
[STG95]
...
[STL89]
Jaideep Srivastava, Jack S. Eddy Tan, Vincent Y. Lum: TBSAM: An Access Method for Efficient Processing of Statistical Queries. IEEE Trans. Knowl. Data Eng. 1(4): 414-423(1989) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[WELD95]
...

Copyright © Fri Mar 12 17:22:55 2010 by Michael Ley (ley@uni-trier.de)