Simple Random Sampling from Relational Databases.
Frank Olken, Doron Rotem:
VLDB 1986: 160-169
@inproceedings{DBLP:conf/vldb/OlkenR86,
author = {Frank Olken and
Doron Rotem},
editor = {Wesley W. Chu and
Georges Gardarin and
Setsuo Ohsuga and
Yahiko Kambayashi},
title = {Simple Random Sampling from Relational Databases},
booktitle = {VLDB'86 Twelfth International Conference on Very Large Data Bases,
August 25-28, 1986, Kyoto, Japan, Proceedings},
publisher = {Morgan Kaufmann},
year = {1986},
isbn = {0-934613-18-4},
pages = {160-169},
ee = {db/conf/vldb/OlkenR86.html},
crossref = {DBLP:conf/vldb/86},
bibsource = {DBLP}
}

Sampling is a fundamental operation for the auditing
and statistical analysis of large databases. It is not
well supported in existing relational database management
systems. We discuss how to obtain samples
from the results of relational queries without first performing
the query. Specifically, we examine simple random
sampling from selections, projections, joins,
unions, and intersections. We discuss data structures
and algorithms for sampling, and their performance.
We show that samples of relational queries can often
be computed for a small fraction of the effort of computing
the entire relational query, i.e., in time proportional
to sample size, rather than time proportional to
the size of the full result of the relational query.
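
As a rough illustration of the idea of sampling from a selection without first computing it, the sketch below probes random positions of a relation and keeps only the rows that satisfy the predicate (acceptance/rejection). It is a minimal sketch under stated assumptions and not necessarily the algorithm of the paper: the relation is assumed to be an in-memory list with random access by position, and the names `sample_selection`, `predicate`, and `max_probes` are hypothetical.

```python
import random


def sample_selection(relation, predicate, n, rng=None, max_probes=None):
    """Draw up to n distinct rows satisfying `predicate` without scanning
    the whole relation, by probing uniformly random positions.

    Assumptions (illustrative only): `relation` is a sequence with
    random access by position; `max_probes` bounds the work when few
    rows qualify, in which case fewer than n rows may be returned.
    """
    if not relation or n <= 0:
        return []
    rng = rng or random.Random()
    if max_probes is None:
        max_probes = 100 * n
    chosen = {}  # position -> row; keys guarantee distinct rows
    probes = 0
    while len(chosen) < n and probes < max_probes:
        probes += 1
        i = rng.randrange(len(relation))  # uniform random position
        if i in chosen:
            continue  # already sampled: reject to sample without replacement
        row = relation[i]
        if predicate(row):  # accept only rows in the selection result
            chosen[i] = row
    return list(chosen.values())


if __name__ == "__main__":
    # Hypothetical relation: one row in four belongs to the "audit" department.
    rows = [{"id": i, "dept": "audit" if i % 4 == 0 else "other"}
            for i in range(1000)]
    print(sample_selection(rows, lambda r: r["dept"] == "audit", 5))
```

The expected number of probes is roughly the sample size divided by the fraction of qualifying rows, so the cost depends on the sample size and the selectivity rather than on the size of the full selection result. When very few rows qualify, rejection becomes expensive and an index on the selection attribute would be the more natural access path.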
