Explaining Differences in Multidimensional Aggregates.

Sunita Sarawagi: Explaining Differences in Multidimensional Aggregates. VLDB 1999: 42-53
@inproceedings{DBLP:conf/vldb/Sarawagi99,
  author    = {Sunita Sarawagi},
  editor    = {Malcolm P. Atkinson and
               Maria E. Orlowska and
               Patrick Valduriez and
               Stanley B. Zdonik and
               Michael L. Brodie},
  title     = {Explaining Differences in Multidimensional Aggregates},
  booktitle = {VLDB'99, Proceedings of 25th International Conference on Very
               Large Data Bases, September 7-10, 1999, Edinburgh, Scotland,
               UK},
  publisher = {Morgan Kaufmann},
  year      = {1999},
  isbn      = {1-55860-615-7},
  pages     = {42-53},
  ee        = {db/conf/vldb/Sarawagi99.html},
  crossref  = {DBLP:conf/vldb/99},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

Our goal is to enhance multidimensional database systems with advanced mining primitives. Current Online Analytical Processing (OLAP) products provide a minimal set of basic aggregate operators like sum and average and a set of basic navigational operators like drill-downs and roll-ups. These operators have to be driven entirely by the analyst's intuition. Such ad hoc exploration gets tedious and error-prone as data dimensionality and size increase. In earlier work we presented one such advanced primitive where we premined OLAP data for exceptions, summarized the exceptions at appropriate levels, and used them to lead the analyst to the interesting regions.

In this paper we present a second enhancement: a single operator that lets the analyst get summarized reasons for drops or increases observed at an aggregated level. This eliminates the need to manually drill down for such reasons. We develop an information-theoretic formulation for expressing the reasons that is compact and easy to interpret. We design a dynamic programming algorithm that requires only one pass of the data, improving significantly over our initial greedy algorithm that required multiple passes. In addition, the algorithm uses a small amount of memory independent of the data size. This allows easy integration with existing OLAP products. We illustrate with our prototype on the DB2/UDB ROLAP product with the Excel Pivot-table frontend. Experiments on this prototype using the OLAP data benchmark demonstrate (1) scalability of our algorithm as the size and dimensionality of the cube increase and (2) feasibility of getting interactive answers even with modest hardware resources.
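
To make the problem concrete, the following minimal Python sketch shows the kind of manual drill-down comparison that such an operator is meant to replace: it compares two aggregated cells and ranks the detail cells by how much each contributes to the difference. It is only an illustration of the task. It is not the paper's information-theoretic formulation or its dynamic programming algorithm, and the function name `explain_difference`, the column names, and the sample rows are all hypothetical.

```python
from collections import defaultdict


def explain_difference(rows, dims, measure, split_dim, value_a, value_b, top_k=3):
    """Rank detail cells by their contribution to sum(value_b) - sum(value_a).

    Naive illustration only: one grouping pass over the rows, then a sort.
    This is NOT the algorithm from the paper.
    """
    # Group on every dimension except the one being compared.
    other_dims = [d for d in dims if d != split_dim]
    totals = defaultdict(lambda: [0.0, 0.0])  # key -> [sum for a, sum for b]
    for row in rows:
        key = tuple(row[d] for d in other_dims)
        if row[split_dim] == value_a:
            totals[key][0] += row[measure]
        elif row[split_dim] == value_b:
            totals[key][1] += row[measure]
    # Difference per detail cell, largest absolute change first.
    diffs = [(key, b - a) for key, (a, b) in totals.items()]
    diffs.sort(key=lambda kv: abs(kv[1]), reverse=True)
    return diffs[:top_k]


# Hypothetical sample data: total sales drop from 230 (1998) to 180 (1999).
rows = [
    {"year": 1998, "region": "East", "product": "A", "sales": 100.0},
    {"year": 1999, "region": "East", "product": "A", "sales": 40.0},
    {"year": 1998, "region": "East", "product": "B", "sales": 50.0},
    {"year": 1999, "region": "East", "product": "B", "sales": 55.0},
    {"year": 1998, "region": "West", "product": "A", "sales": 80.0},
    {"year": 1999, "region": "West", "product": "A", "sales": 85.0},
]

top = explain_difference(
    rows, ["year", "region", "product"], "sales", "year", 1998, 1999, top_k=1
)
print(top)
```

With this sample data the single largest contributor is the (East, A) cell, which accounts for a drop of 60 while the other cells rise slightly. A flat ranking like this grows unwieldy as dimensionality increases, which is the motivation the abstract gives for a compact, summarized explanation.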

Copyright © 1999 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Printed Edition

Malcolm P. Atkinson, Maria E. Orlowska, Patrick Valduriez, Stanley B. Zdonik, Michael L. Brodie (Eds.): VLDB'99, Proceedings of 25th International Conference on Very Large Data Bases, September 7-10, 1999, Edinburgh, Scotland, UK. Morgan Kaufmann 1999, ISBN 1-55860-615-7

References

[Arb]
...
[CD97]
Surajit Chaudhuri, Umeshwar Dayal: An Overview of Data Warehousing and OLAP Technology. SIGMOD Record 26(1): 65-74 (1997)
[Cod93]
...
[Cor97a]
...
[Cor97b]
...
[Cou]
...
[CT91]
...
[Dis]
...
[Ham77]
...
[HF95]
Jiawei Han, Yongjian Fu: Discovery of Multiple-Level Association Rules from Large Databases. VLDB 1995: 420-431
[Lap95]
...
[Mic98a]
...
[Mic98b]
...
[SAM98]
Sunita Sarawagi, Rakesh Agrawal, Nimrod Megiddo: Discovery-Driven Exploration of OLAP Data Cubes. EDBT 1998: 168-182
[Sof]
...

Copyright © Tue Mar 16 02:22:08 2010 by Michael Ley (ley@uni-trier.de)