Histogram-Based Approximation of Set-Valued Query-Answers.
Yannis E. Ioannidis, Viswanath Poosala:
Histogram-Based Approximation of Set-Valued Query-Answers.
VLDB 1999: 174-185@inproceedings{DBLP:conf/vldb/IoannidisP99,
author = {Yannis E. Ioannidis and
Viswanath Poosala},
editor = {Malcolm P. Atkinson and
Maria E. Orlowska and
Patrick Valduriez and
Stanley B. Zdonik and
Michael L. Brodie},
title = {Histogram-Based Approximation of Set-Valued Query-Answers},
booktitle = {VLDB'99, Proceedings of 25th International Conference on Very
Large Data Bases, September 7-10, 1999, Edinburgh, Scotland,
UK},
publisher = {Morgan Kaufmann},
year = {1999},
isbn = {1-55860-615-7},
pages = {174-185},
ee = {db/conf/vldb/IoannidisP99.html},
crossref = {DBLP:conf/vldb/99},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
Answering queries approximately has recently
been proposed as a way to reduce query response
times in on-line decision support systems, when
the precise answer is not necessary or early
feedback is helpful. Most of the work in this
area uses sampling-based techniques and handles
aggregate queries, ignoring queries that return
relations as answers. In this paper, we extend the
scope of approximate query answering to general queries.
We propose a novel and intuitive error measure for quantifying
the error in an approximate query answer, which can be a
multiset in general. We also study the use of histograms
in approximate query answering as an alternative to sampling.
In that direction, we develop a histogram algebra and
demonstrate how complex SQL queries on a database may
be translated into algebraic operations on the corresponding
histograms. Finally, we present the results of an initial
set of experiments where various types of histograms and
sampling are compared with respect to their effectiveness
in approximate query answering as captured by the introduced
error measure. The results indicate that the MaxDiff(V,A)
histograms provide quality approximations for both set-valued and
aggregate queries, while sampling is competitive mainly for
aggregate queries with no join operators.
Copyright © 1999 by the VLDB Endowment.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or
distributed for direct commercial advantage, the VLDB
copyright notice and the title of the publication and
its date appear, and notice is given that copying
is by the permission of the Very Large Data Base
Endowment. To copy otherwise, or to republish, requires
a fee and/or special permission from the Endowment.
Online Paper
DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...
Printed Edition
Malcolm P. Atkinson, Maria E. Orlowska, Patrick Valduriez, Stanley B. Zdonik, Michael L. Brodie (Eds.):
VLDB'99, Proceedings of 25th International Conference on Very Large Data Bases, September 7-10, 1999, Edinburgh, Scotland, UK.
Morgan Kaufmann 1999, ISBN 1-55860-615-7
Contents
References
- [1]
- Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala, Sridhar Ramaswamy:
Join Synopses for Approximate Query Answering.
SIGMOD Conference 1999: 275-286
- [2]
- Peter Buneman, Susan B. Davidson, Aaron Watters:
A Semantics for Complex Objects and Approximate Queries.
PODS 1988: 305-314
- [3]
- Chung-Min Chen, Nick Roussopoulos:
Adaptive Selectivity Estimation Using Query Feedback.
SIGMOD Conference 1994: 161-172
- [4]
- Phillip B. Gibbons, Yossi Matias:
New Sampling-Based Summary Statistics for Improving Approximate Query Answers.
SIGMOD Conference 1998: 331-342
- [5]
- Phillip B. Gibbons, Yossi Matias, Viswanath Poosala:
Fast Incremental Maintenance of Approximate Histograms.
VLDB 1997: 466-475
- [6]
- Joseph M. Hellerstein, Peter J. Haas, Helen J. Wang:
Online Aggregation.
SIGMOD Conference 1997: 171-182
- [7]
- ...
- [8]
- ...
- [9]
- Robert Kooi:
The Optimization of Queries in Relational Databases.
Ph.D. thesis, Case Western Reserve University 1980
- [10]
- Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider:
Practical Selectivity Estimation through Adaptive Sampling.
SIGMOD Conference 1990: 1-11
- [11]
- Amihai Motro:
Query Generalization: A Method for Interpreting Null Answers.
Expert Database Workshop 1984: 597-616
- [12]
- Gultekin Özsoyoglu, Kaizheng Du, Sujatha Guru Swamy, Wen-Chi Hou:
Processing Real-Time, Non-Aggregate Queries with Time-Constraints in CASE-DB.
ICDE 1992: 410-417
- [13]
- Gregory Piatetsky-Shapiro, Charles Connell:
Accurate Estimation of the Number of Tuples Satisfying a Condition.
SIGMOD Conference 1984: 256-276
- [14]
- Viswanath Poosala, Venkatesh Ganti:
Fast Approximate Answers to Aggregate Queries on a Data Cube.
SSDBM 1999: 24-33
- [15]
- Viswanath Poosala, Venkatesh Ganti:
Fast Approximate Query Answering Using Precomputed Statistics.
ICDE 1999: 252
- [16]
- Viswanath Poosala, Yannis E. Ioannidis:
Selectivity Estimation Without the Attribute Value Independence Assumption.
VLDB 1997: 486-495
- [17]
- Viswanath Poosala, Yannis E. Ioannidis, Peter J. Haas, Eugene J. Shekita:
Improved Histograms for Selectivity Estimation of Range Predicates.
SIGMOD Conference 1996: 294-305
- [18]
- ...
- [19]
- Jeffrey Scott Vitter, Min Wang, Balakrishna R. Iyer:
Data Cube Approximation and Histograms via Wavelets.
CIKM 1998: 96-104
- [20]
- Susan V. Vrbsky, Jane W.-S. Liu:
APPROXIMATE - A Query Processor that Produces Monotonically Improving Approximate Answers.
IEEE Trans. Knowl. Data Eng. 5(6): 1056-1068(1993)
- [21]
- George Kingsley Zipf:
Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology.
Addison-Wesley 1949
Copyright © Tue Mar 16 02:22:08 2010
by Michael Ley (ley@uni-trier.de)