Seeking the Truth About ad hoc Join Costs.
Laura M. Haas, Michael J. Carey, Miron Livny, Amit Shukla:
Seeking the Truth About ad hoc Join Costs.
VLDB J. 6(3): 241-256(1997)@article{DBLP:journals/vldb/HaasCLS97,
author = {Laura M. Haas and
Michael J. Carey and
Miron Livny and
Amit Shukla},
title = {Seeking the Truth About ad hoc Join Costs},
journal = {VLDB J.},
volume = {6},
number = {3},
year = {1997},
pages = {241-256},
ee = {db/journals/vldb/HaasCLS97.html},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
In this paper, we re-examine the results of prior work on methods for computing ad hoc joins.
We develop a detailed cost model for predicting join algorithm performance, and we use the model to develop cost formulas for the major ad hoc join methods found in the relational database literature.
We show that various pieces of "common wisdom" about join algorithm performance fail to hold up when analyzed carefully, and we use our detailed cost model to derive optimal buffer allocation schemes for each of the join methods examined here.
We show that optimizing their buffer allocations can lead to large performance improvements, e.g., as much as a 400% improvement in some cases.
We also validate our cost model's predictions by measuring an actual implementation of each join algorithm considered.
The results of this work should be directly useful to implementors of relational query optimizers and query processing systems.
Key Words
Optimization, Cost models, Join methods, Buffer allocation, Performance
Copyright © 1997 by Springer, Berlin, Heidelberg.
Permission to make digital or hard copies of the abstract is granted provided that copies are not made or distributed for profit or
direct commercial advantage, and that copies show this notice along with the full citation.
Citation Page
CDROM Version: Load the CDROM "Volume 4 Issue 1, Books, VLDB-j, TODS, ..." and ...
DVD Version: Load ACM SIGMOD Anthology DVD 2" and ...
References
- [1]
- Mike W. Blasgen, Kapali P. Eswaran:
Storage and Access in Relational Data Bases.
IBM Systems Journal 16(4): 362-377(1977)

- [2]
- Kjell Bratbergsengen:
Hashing Methods and Relational Algebra Operations.
VLDB 1984: 323-333

- [3]
- ...
- [4]
- Michael J. Carey, Laura M. Haas, Miron Livny:
Tapes Hold Data, Too: Challenges of Tuples on Tertiary Store.
SIGMOD Conference 1993: 413-417

- [5]
- Diane L. Davison, Goetz Graefe:
Dynamic Resource Brokering for Multi-User Query Execution.
SIGMOD Conference 1995: 281-292

- [6]
- David J. DeWitt, Randy H. Katz, Frank Olken, Leonard D. Shapiro, Michael Stonebraker, David A. Wood:
Implementation Techniques for Main Memory Database Systems.
SIGMOD Conference 1984: 1-8

- [7]
- ...
- [8]
- Goetz Graefe:
Query Evaluation Techniques for Large Databases.
ACM Comput. Surv. 25(2): 73-170(1993)

- [9]
- Goetz Graefe, Ann Linville, Leonard D. Shapiro:
Sort versus Hash Revisited.
IEEE Trans. Knowl. Data Eng. 6(6): 934-944(1994)

- [10]
- Robert B. Hagmann:
An Observation on Database Buffering Performance Metrics.
VLDB 1986: 289-293

- [11]
- Yannis E. Ioannidis, Stavros Christodoulakis:
On the Propagation of Errors in the Size of Join Results.
SIGMOD Conference 1991: 268-277

- [12]
- Won Kim:
A New Way to Compute the Product and Join of Relations.
SIGMOD Conference 1980: 179-187

- [13]
- Donald E. Knuth:
The Art of Computer Programming, Volume III: Sorting and Searching.
Addison-Wesley 1973, ISBN 0-201-03803-X

- [14]
- ...
- [15]
- Lothar F. Mackert, Guy M. Lohman:
R* Optimizer Validation and Performance Evaluation for Local Queries.
SIGMOD Conference 1986: 84-95

- [16]
- Michael V. Mannino, Paicheng Chu, Thomas Sager:
Statistical Profile Estimation in Database Systems.
ACM Comput. Surv. 20(3): 191-221(1988)

- [17]
- Priti Mishra, Margaret H. Eich:
Join Processing in Relational Databases.
ACM Comput. Surv. 24(1): 63-113(1992)

- [18]
- HweeHwa Pang, Michael J. Carey, Miron Livny:
Partially Preemptive Hash Joins.
SIGMOD Conference 1993: 59-68

- [19]
- Jignesh M. Patel, Michael J. Carey, Mary K. Vernon:
Accurate Modeling of the Hybrid Hash Join Algorithm.
SIGMETRICS 1994: 56-66

- [20]
- Betty Salzberg:
Merging Sorted Runs Using Large Main Memory.
Acta Inf. 27(3): 195-215(1989)

- [21]
- Betty Salzberg, Alex Tsukerman, Jim Gray, Michael Stewart, Susan Uren, Bonnie Vaughan:
FastSort: A Distributed Single-Input Single-Output External Sort.
SIGMOD Conference 1990: 94-101

- [22]
- Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price:
Access Path Selection in a Relational Database Management System.
SIGMOD Conference 1979: 23-34

- [23]
- Eugene J. Shekita, Michael J. Carey:
A Performance Evaluation of Pointer-Based Joins.
SIGMOD Conference 1990: 300-311

- [24]
- Leonard D. Shapiro:
Join Processing in Database Systems with Large Main Memories.
ACM Trans. Database Syst. 11(3): 239-264(1986)

- [25]
- Patrick Valduriez:
Join Indices.
ACM Trans. Database Syst. 12(2): 218-246(1987)

- [26]
- ...
Copyright © Fri Mar 12 17:34:26 2010
by Michael Ley (ley@uni-trier.de)