Clustering Techniques for Minimizing External Path Length.
Ajit A. Diwan, Sanjeeva Rane, S. Seshadri, S. Sudarshan:
Clustering Techniques for Minimizing External Path Length.
VLDB 1996: 342-353@inproceedings{DBLP:conf/vldb/DiwanRSS96,
author = {Ajit A. Diwan and
Sanjeeva Rane and
S. Seshadri and
S. Sudarshan},
editor = {T. M. Vijayaraman and
Alejandro P. Buchmann and
C. Mohan and
Nandlal L. Sarda},
title = {Clustering Techniques for Minimizing External Path Length},
booktitle = {VLDB'96, Proceedings of 22th International Conference on Very
Large Data Bases, September 3-6, 1996, Mumbai (Bombay), India},
publisher = {Morgan Kaufmann},
year = {1996},
isbn = {1-55860-382-4},
pages = {342-353},
ee = {db/conf/vldb/DiwanRSS96.html},
crossref = {DBLP:conf/vldb/96},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
There are a variety of main-memory access structures,
such as segment trees, and quad trees, whose properties,
such as good worst-case behaviour, make them attractive for
database applications.
Unfortunately, the structures are typically `long and skinny', whereas
disk data structures must be `short-and-fat' (that is, have a high
fanout and low height) in order to minimize I/O.
We consider how to cluster the nodes (that is, map the nodes to disk pages)
of mainmemory access structures such that
that although a path may traverse many nodes, it only traverses a few
disk pages. The number of disk pages traversed in a path is called the
external path length. We address several version of the clustering problem.
We present a clustering algorithm for tree structures that generates
optimal worst-case external path length mappings;
we also show how to make it dynamic, to support updates.
We extend the algorithm to generate mappings
that minimize the average weighted external path lengths.
We also show that some other clustering problems,
such as finding optimal external path lengths
for DAG structures and minimizing weights for optimal height mappings,
are NP-complete. We present efficient heurisitcs for these problem.
We present a performance study (using quad-trees on
actual image data as an example)
which shows that our algorithms perform well.
Our algorithms can also be used for clustering complex objects in
object-oriented databases.
Copyright © 1996 by the VLDB Endowment.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or
distributed for direct commercial advantage, the VLDB
copyright notice and the title of the publication and
its date appear, and notice is given that copying
is by the permission of the Very Large Data Base
Endowment. To copy otherwise, or to republish, requires
a fee and/or special permission from the Endowment.
Online Paper
CDROM Version: Load the CDROM "Volume 1 Issue 5, VLDB '89-'97" and ...
DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...
Printed Edition
T. M. Vijayaraman, Alejandro P. Buchmann, C. Mohan, Nandlal L. Sarda (Eds.):
VLDB'96, Proceedings of 22th International Conference on Very Large Data Bases, September 3-6, 1996, Mumbai (Bombay), India.
Morgan Kaufmann 1996, ISBN 1-55860-382-4
Contents
Electronic Edition
References
- [BKKG88]
- Jay Banerjee, Won Kim, Sung-Jo Kim, Jorge F. Garza:
Clustering a DAG for CAD Databases.
IEEE Trans. Software Eng. 14(11): 1684-1699(1988)
- [BKSS90]
- Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider, Bernhard Seeger:
The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles.
SIGMOD Conference 1990: 322-331
- [Ch91]
- Jia-bing R. Cheng, Ali R. Hurson:
Effective Clustering of Complex Objects in Object-Oriented Databases.
SIGMOD Conference 1991: 22-31
- [CK89]
- Ellis E. Chang, Randy H. Katz:
Exploiting Inheritance and Structure Semantics for Effective Clustering and Buffering in an Object-Oriented DBMS.
SIGMOD Conference 1989: 348-357
- [DSST86]
- James R. Driscoll, Neil Sarnak, Daniel Dominic Sleator, Robert Endre Tarjan:
Making Data Structures Persistent.
STOC 1986: 109-121
- [Gut84]
- Antonin Guttman:
R-Trees: A Dynamic Index Structure for Spatial Searching.
SIGMOD Conference 1984: 47-57
- [Jag90]
- H. V. Jagadish:
Linear Clustering of Objects with Multiple Atributes.
SIGMOD Conference 1990: 332-342
- [KTF95]
- Anil Kumar, Vassilis J. Tsotras, Christos Faloutsos:
Access Methods for Bi-Temporal Databases.
Temporal Databases 1995: 235-254
- [LS92]
- Qing Li, John L. Smith:
A Conceptual Model for Dynamic Clustering in Object Databases.
VLDB 1992: 457-468
- [NGV93]
- Mark H. Nodine, Michael T. Goodrich, Jeffrey Scott Vitter:
Blocking for External Graph Searching.
PODS 1993: 222-232
- [Ore89]
- Jack A. Orenstein:
Redundancy in Spatial Databases.
SIGMOD Conference 1989: 295-305
- [Ore90]
- Jack A. Orenstein:
A Comparison of Spatial Query Processing Techniques for Native and Parameter Spaces.
SIGMOD Conference 1990: 343-352
- [RS94]
- Sridhar Ramaswamy, Sairam Subramanian:
Path Caching: A Technique for Optimal External Searching.
PODS 1994: 25-35
- [Sam95]
- Hanan Samet:
Spatial Data Structures.
Modern Database Systems 1995: 361-385
- [SB93]
- Clifford A. Shaffer, Patrick R. Brown:
A Paging Scheme for Pointer-Based Quadtrees.
SSD 1993: 89-104
- [Sch77]
- Mario Schkolnick:
A Clustering Algorithm for Hierarchical Structures.
ACM Trans. Database Syst. 2(1): 27-44(1977)
- [SRF87]
- Timos K. Sellis, Nick Roussopoulos, Christos Faloutsos:
The R+-Tree: A Dynamic Index for Multi-Dimensional Objects.
VLDB 1987: 507-518
- [SSN90]
- Clifford A. Shaffer, Hanan Samet, Randal C. Nelson:
QUILT: A Geographic Information System based on Quadtrees.
IJGIS 4(2): 103-131(1990)
- [TN91]
- Manolis M. Tsangaris, Jeffrey F. Naughton:
A Stochastic Approach for Clustering in Object Bases.
SIGMOD Conference 1991: 12-21
- [TN92]
- Manolis M. Tsangaris, Jeffrey F. Naughton:
On the Performance of Object Clustering Techniques.
SIGMOD Conference 1992: 144-153
Copyright © Tue Mar 16 02:22:05 2010
by Michael Ley (ley@uni-trier.de)