Random Sampling from B+ Trees.
Frank Olken, Doron Rotem:
Random Sampling from B+ Trees.
VLDB 1989: 269-277
@inproceedings{DBLP:conf/vldb/OlkenR89,
author = {Frank Olken and
Doron Rotem},
editor = {Peter M. G. Apers and
Gio Wiederhold},
title = {Random Sampling from B+ Trees},
booktitle = {Proceedings of the Fifteenth International Conference on Very
Large Data Bases, August 22-25, 1989, Amsterdam, The Netherlands},
publisher = {Morgan Kaufmann},
year = {1989},
isbn = {1-55860-101-5},
pages = {269-277},
ee = {db/conf/vldb/OlkenR89.html},
crossref = {DBLP:conf/vldb/89},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
We consider the design and analysis of algorithms to retrieve simple random samples from databases.
Specifically, we examine simple random sampling from B+ tree files.
Existing methods of sampling from B+ trees require the use of auxiliary rank information in the nodes of the tree.
Such modified B+ tree files are called "ranked B+ trees".
We compare sampling from ranked B+ tree files with new acceptance/rejection (A/R) sampling methods which sample directly from standard B+ trees.
Our new A/R sampling algorithm can easily be retrofitted to existing DBMSs, and does not require the overhead of maintaining rank information.
We consider both iterative and batch sampling methods.
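The acceptance/rejection idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm as published, only one plausible reading of A/R sampling on an unranked tree: at every level a slot is drawn uniformly from the maximum possible fanout, and the trial is rejected if that slot is empty. The class names (Internal, Leaf), the function names (ar_trial, sample_one), and the parameters max_fanout and max_leaf are assumptions made for illustration. The sketch also assumes an in-memory tree whose leaves all sit at the same depth.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Leaf:
    """Hypothetical leaf page holding up to max_leaf integer records."""
    records: List[int]


@dataclass
class Internal:
    """Hypothetical internal page holding up to max_fanout children."""
    children: List["Node"]


Node = Union[Leaf, Internal]


def ar_trial(root: Node, max_fanout: int, max_leaf: int,
             rng: random.Random) -> Optional[int]:
    """Run one acceptance/rejection trial.

    Returns a record, or None if the trial was rejected. Because each
    level draws from the same fixed range, every record is reached with
    the same probability, provided all leaves are at the same depth.
    """
    node = root
    while isinstance(node, Internal):
        slot = rng.randrange(max_fanout)   # slot in [0, max_fanout)
        if slot >= len(node.children):     # empty slot: reject early
            return None
        node = node.children[slot]
    slot = rng.randrange(max_leaf)         # slot in [0, max_leaf)
    if slot >= len(node.records):          # empty record slot: reject
        return None
    return node.records[slot]


def sample_one(root: Node, max_fanout: int, max_leaf: int,
               rng: random.Random) -> int:
    """Repeat trials until one is accepted (iterative sampling)."""
    while True:
        record = ar_trial(root, max_fanout, max_leaf, rng)
        if record is not None:
            return record


# Small demo tree with uneven page occupancy (assumed data).
demo_root = Internal(children=[Leaf([1, 2, 3]), Leaf([4]), Leaf([5, 6])])
demo_rng = random.Random(0)
print(sample_one(demo_root, max_fanout=4, max_leaf=3, rng=demo_rng))
```

In this sketch no rank information is stored in the nodes; the price is that sparsely filled pages cause more rejected trials, so the expected number of root-to-leaf descents per accepted sample grows as average page occupancy falls.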
Copyright © 1989 by the VLDB Endowment.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or
distributed for direct commercial advantage, the VLDB
copyright notice and the title of the publication and
its date appear, and notice is given that copying
is by the permission of the Very Large Data Base
Endowment. To copy otherwise, or to republish, requires
a fee and/or special permission from the Endowment.
Printed Edition
Peter M. G. Apers, Gio Wiederhold (Eds.):
Proceedings of the Fifteenth International Conference on Very Large Data Bases, August 22-25, 1989, Amsterdam, The Netherlands.
Morgan Kaufmann 1989, ISBN 1-55860-101-5
References
- [Ark84]
- ...
- [BK75]
- ...
- [Coc77]
- William G. Cochran:
Sampling Techniques, 3rd Edition.
John Wiley 1977, ISBN 0-471-16240-X
- [EN82]
- Jarmo Ernvall, Olli Nevalainen:
An Algorithm for Unbiased Random Sampling.
Comput. J. 25(1): 45-47 (1982)
- [FMR62]
- ...
- [Gho86]
- Sakti P. Ghosh:
SIAM: statistics information access method.
Inf. Syst. 13(4): 359-368 (1988)
- [HOT88]
- Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja:
Statistical Estimators for Relational Algebra Expressions.
PODS 1988: 276-287
- [Knu73]
- Donald E. Knuth:
The Art of Computer Programming, Volume III: Sorting and Searching.
Addison-Wesley 1973, ISBN 0-201-03803-X
- [LTA79]
- ...
- [LWW84]
- ...
- [Mon85]
- ...
- [Pal85]
- Prashant Palvia:
Expressions for Batched Searching of Sequential and Hierarchical Files.
ACM Trans. Database Syst. 10(1): 97-106 (1985)
- [SL88]
- Jaideep Srivastava, Vincent Y. Lum:
A Tree Based Access Method (TBSAM) for Fast Processing of Aggregate Queries.
ICDE 1988: 504-510
- [Vit84]
- Jeffrey Scott Vitter:
Faster Methods for Random Sampling.
Commun. ACM 27(7): 703-718 (1984)
- [Vit85]
- Jeffrey Scott Vitter:
Random Sampling with a Reservoir.
ACM Trans. Math. Softw. 11(1): 37-57 (1985)
- [WE80]
- C. K. Wong, Malcolm C. Easton:
An Efficient Method for Weighted Sampling Without Replacement.
SIAM J. Comput. 9(1): 111-113 (1980)
- [Yao77]
- S. Bing Yao:
Approximating the Number of Accesses in Database Organizations.
Commun. ACM 20(4): 260-261 (1977)
Copyright © Tue Mar 16 02:22:00 2010
by Michael Ley (ley@uni-trier.de)