Choosing an Efficient Internal Schema.

Frank Wm. Tompa: Choosing an Efficient Internal Schema. VLDB 1976: 65-77

@inproceedings{DBLP:conf/vldb/Tompa76,
  author    = {Frank Wm. Tompa},
  editor    = {Peter C. Lockemann and
               Erich J. Neuhold},
  title     = {Choosing an Efficient Internal Schema},
  booktitle = {Systems for Large Data Bases, September 8-10, 1976, Brussels,
               Belgium},
  publisher = {North Holland {\&} IFIP},
  year      = {1976},
  isbn      = {0-7204-0546-7},
  pages     = {65-77},
  ee        = {db/conf/vldb/Tompa76.html},
  crossref  = {DBLP:conf/vldb/76},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

The internal schema for a data base should not be concocted in an ad hoc manner, but rather result from an algorithmic design. This paper describes a methodology for choosing an efficient internal schema for a data base, given the conceptual schema and a target machine. It is assumed that the conceptual schema is composed of abstract data types which have associated "activity profiles" reflecting expected usage (e.g., expected set sizes and relative frequencies for insertions and deletions).

The methodology requires the construction of a library of permissible realizations for each abstract data type which may be used in any conceptual schema. Each member of the library is a cluster of code implementing the defined operations for one particular representation of a data type. (These clusters, which are invoked when a program accesses a data base, may be regarded as either open or closed subroutines.) Using the target machine's functional characteristics, a translator compiling the code can automatically generate parametric formulas which represent the expected run time and storage space used by the data that is to be manipulated by each cluster.

The application of the methodology itself starts with the construction of an evaluation matrix, for which the entries are derived from the library's parametric formulas by substituting values which reflect each component of the conceptual schema. Thus, for every occurrence of a data type in the conceptual schema, this matrix incorporates the expected run time and storage space required for each cluster which may be used to realize that type. The evaluation matrix is then searched using branch and bound to choose, for each data type occurrence, that cluster which minimizes the internal schema's overall cost.

The major advantage of this approach is that it includes a comparison of all internal representations which can be realized by superimposing members from a library. Therefore, a cluster which is traditionally used in one specific application, and which is therefore present in the library, will be considered for every occurrence of the corresponding type in the conceptual schema. Automation of the algorithm results in a thorough, yet efficient, evaluation of all alternatives, thus ensuring an efficient data base representation.

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 1 Issue 4, VLDB '75-'88" and ...

Windows: Click the letter of your CD drive
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Mac: Click here
UNIX/LINUX: mount the CD and click on the path of your mount point:
/Anthology/vldb7588 or /cdrom

DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...

Windows: Click the letter of your CD drive
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Mac: Click here
UNIX/LINUX: mount the DVD and click on the path of your mount point:
/Anthology/aDVD1 or /dvd

Printed Edition

Peter C. Lockemann, Erich J. Neuhold (Eds.): Systems for Large Data Bases, September 8-10, 1976, Brussels, Belgium. North Holland & IFIP 1976, ISBN 0-7204-0546-7
Contents

References

[ANSI75]: ...
[Cohen and Zuckerman74]: Jacques Cohen, Carl Zuckerman: Two Languages for Estimating Program Efficiency. Commun. ACM 17(6): 301-308(1974)
[Cohen74]: ...
[Dahl et al.68]: ...
[Gotlieb and Tompa74]: C. C. Gotlieb, Frank Wm. Tompa: Choosing a Storage Schema. Acta Inf. 3: 297-319(1974)
[Lawler and Wood66]: ...
[Liskov and Zilles74]: Barbara Liskov, Stephen N. Zilles: Programming with Abstract Data Types. SIGPLAN Notices 9(4): 50-59(1974)
[Tompa74]: ...
[Yao and Merten75]: S. Bing Yao, Alan G. Merten: Selection of File Organization Using an Analytic Model. VLDB 1975: 255-267