Beyond Uniformity and Independence: Analysis of R-trees Using the Concept of Fractal Dimension.
Christos Faloutsos, Ibrahim Kamel:
We propose the concept of fractal dimension of a set of points, in
order to quantify the deviation from the uniform distribution.
Using measurements on real data sets (road intersections of U.S.
counties, star coordinates from NASA's Infrared-Ultraviolet Explorer,
etc.), we provide evidence that real data are indeed skewed, and,
moreover, we show that they behave as mathematical fractals, with a
measurable, non-integer fractal dimension.
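
As an illustration of what "measurable" can mean here, the following is a minimal sketch of a box-counting estimate for a 2-d point set. It is not taken from the paper: the function names `box_count` and `box_counting_dimension`, the choice of grid sizes, and the plain least-squares fit are all assumptions made for this example. The idea is to overlay grids of decreasing cell side, count the occupied cells, and take the slope of log(count) against log(1/side) as the dimension estimate.

```python
import math


def box_count(points, side):
    """Count the grid cells of the given side length that hold at least one point."""
    occupied = {(math.floor(x / side), math.floor(y / side)) for x, y in points}
    return len(occupied)


def box_counting_dimension(points, sides):
    """Estimate the dimension as the least-squares slope of
    log(occupied cells) versus log(1 / side).

    `sides` is a hypothetical list of grid cell sizes chosen by the caller;
    it needs at least two distinct values.
    """
    xs = [math.log(1.0 / s) for s in sides]
    ys = [math.log(box_count(points, s)) for s in sides]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    sides = [1.0 / 2 ** k for k in range(1, 6)]
    # Points on a diagonal line: a one-dimensional set embedded in the plane.
    line = [(i / 4096, i / 4096) for i in range(4096)]
    # Points on a full 64 x 64 grid: a set that fills the unit square.
    grid = [(i / 64, j / 64) for i in range(64) for j in range(64)]
    print("line:", box_counting_dimension(line, sides))
    print("grid:", box_counting_dimension(grid, sides))
```

On these two synthetic inputs the slope comes out close to 1 for the diagonal line and close to 2 for the filled grid; a skewed real data set would be expected to fall at a non-integer value in between, provided the fit is restricted to the range of scales where the log-log plot is roughly linear.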
Armed with this tool, we then show its practical use in predicting the
performance of spatial access methods, and specifically of
R-trees. We provide the first analysis of R-trees for skewed
distributions of points: we develop a formula that estimates the
number of disk accesses for range queries, given only the fractal
dimension of the point set and the number of points. Experiments on real data
sets show that the formula is very accurate: the relative error is
usually below 5%, and it rarely exceeds 10%.
We believe that the fractal dimension will help replace the uniformity
and independence assumptions, allowing more accurate analysis for any
spatial access method, as well as better estimates for query
optimization on multi-attribute queries.
Journal Edition
Christos Faloutsos, Ibrahim Kamel:
Relaxing the Uniformity and Independence Assumptions Using the Concept of Fractal Dimension.
J. Comput. Syst. Sci. 55(2): 229-240 (1997)
