@inproceedings{DBLP:conf/vldb/KitsuregawaNT89,
  author    = {Masaru Kitsuregawa and Masaya Nakayama and Mikio Takagi},
  editor    = {Peter M. G. Apers and Gio Wiederhold},
  title     = {The Effect of Bucket Size Tuning in the Dynamic Hybrid GRACE Hash Join Method},
  booktitle = {Proceedings of the Fifteenth International Conference on Very Large Data Bases, August 22-25, 1989, Amsterdam, The Netherlands},
  publisher = {Morgan Kaufmann},
  year      = {1989},
  isbn      = {1-55860-101-5},
  pages     = {257-266},
  ee        = {db/conf/vldb/KitsuregawaNT89.html},
  crossref  = {DBLP:conf/vldb/89},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
In this paper, we present a detailed analysis and performance evaluation of the Dynamic Hybrid GRACE Hash Join Method (DHGH Method) when the tuple distribution in buckets is unbalanced.
Conventional hash join methods specify the tuple distribution in buckets statically. However, the actual distribution may differ from the estimate, since join operations are applied together with selection operations. When the tuple distribution in buckets is unbalanced, a join processed with the Hybrid Hash Join Method (HH Method) costs more than in the ideal case. The DHGH Method, on the other hand, selects the buckets to destage dynamically, and so gives the same performance as the ideal case even when the tuple distribution in buckets is unbalanced, as with Zipf-like distributions.
We analyze the total I/O cost of a join operation for various numbers of buckets. The results show that the number of buckets has to be determined from the tuple distribution in buckets rather than from the size of the source relation. They also show that it is better to partition the source relation into a large number of small buckets than into the smaller number of buckets, each almost filling the whole of main memory, adopted in the HH Method.
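The contrast between static and dynamic bucket selection can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions and does not reproduce the paper's algorithm or cost model: all function names are hypothetical, the static plan pins as many leading buckets as would fit if tuples were spread uniformly, the dynamic variant destages the largest resident bucket when memory fills (an assumed victim policy), and cost is counted simply as the number of tuples written to disk during partitioning.

```python
import random


def zipf_bucket_sizes(num_tuples, num_buckets, skew=1.0):
    """Split num_tuples over num_buckets with Zipf-like weights 1/rank**skew."""
    weights = [1.0 / (rank ** skew) for rank in range(1, num_buckets + 1)]
    total = sum(weights)
    sizes = [int(num_tuples * w / total) for w in weights]
    sizes[0] += num_tuples - sum(sizes)  # rounding remainder goes to bucket 0
    return sizes


def make_stream(sizes, seed=0):
    """Return a shuffled list of bucket ids, one entry per tuple."""
    stream = [b for b, n in enumerate(sizes) for _ in range(n)]
    random.Random(seed).shuffle(stream)
    return stream


def static_spill(sizes, memory):
    """Tuples written to disk when resident buckets are fixed in advance.

    Assumption: the planner expects a uniform distribution and pins the
    first k buckets, where k buckets of average size would fit in memory.
    Pinned tuples that do not fit in memory overflow to disk.
    """
    num_tuples = sum(sizes)
    k = min(len(sizes), memory * len(sizes) // num_tuples)
    pinned = sum(sizes[:k])
    overflow = max(0, pinned - memory)
    return sum(sizes[k:]) + overflow


def dynamic_spill(stream, num_buckets, memory):
    """Tuples written to disk when destaging buckets are chosen at run time.

    Assumption: every bucket starts resident; when memory is full, the
    largest resident bucket is destaged, and its later tuples go to disk.
    """
    assert memory > 0
    resident = [0] * num_buckets      # tuples held in memory per bucket
    destaged = [False] * num_buckets  # True once a bucket lives on disk
    in_memory = 0
    written = 0
    for b in stream:
        if destaged[b]:
            written += 1
            continue
        if in_memory >= memory:
            victim = max(
                (i for i in range(num_buckets) if not destaged[i]),
                key=lambda i: resident[i],
            )
            written += resident[victim]
            in_memory -= resident[victim]
            resident[victim] = 0
            destaged[victim] = True
            if victim == b:
                written += 1
                continue
        resident[b] += 1
        in_memory += 1
    return written


if __name__ == "__main__":
    NUM_TUPLES, MEMORY = 10_000, 2_000  # illustrative values only
    for num_buckets in (8, 32, 128):
        sizes = zipf_bucket_sizes(NUM_TUPLES, num_buckets)
        stream = make_stream(sizes, seed=42)
        print(num_buckets,
              static_spill(sizes, MEMORY),
              dynamic_spill(stream, num_buckets, MEMORY))
```

Running the script prints, for each bucket count, the tuples written under the static plan and under the dynamic one. Varying `num_buckets` and `skew` gives a rough feel for how bucket granularity interacts with skew, though the numbers depend entirely on the simplifying assumptions above.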
Copyright © 1989 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.