ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

Memory Management During Run Generation in External Sorting.

Per-Åke Larson, Goetz Graefe: Memory Management During Run Generation in External Sorting. SIGMOD Conference 1998: 472-483
@inproceedings{DBLP:conf/sigmod/LarsonG98,
  author    = {Per-{\AA}ke Larson and
               Goetz Graefe},
  editor    = {Laura M. Haas and
               Ashutosh Tiwary},
  title     = {Memory Management During Run Generation in External Sorting},
  booktitle = {SIGMOD 1998, Proceedings ACM SIGMOD International Conference
               on Management of Data, June 2-4, 1998, Seattle, Washington, USA},
  publisher = {ACM Press},
  year      = {1998},
  isbn      = {0-89791-995-5},
  pages     = {472-483},
  ee        = {http://doi.acm.org/10.1145/276304.276346, db/conf/sigmod/LarsonG98.html},
  crossref  = {DBLP:conf/sigmod/98},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

If replacement selection is used in an external mergesort to generate initial runs, individual records are deleted and inserted in the sort operation's workspace. Variable-length records introduce the need for possibly complex memory management and extra copying of records. As a result, few systems employ replacement selection, even though it produces longer runs than commonly used algorithms. We experimentally compared several algorithms and variants for managing this workspace. We found that the simple best fit algorithm achieves memory utilization of 90% or better and run lengths over 1.8 times workspace size, with no extra copying of records and very little other overhead, for widely varying record sizes and for a wide range of memory sizes. Thus, replacement selection is a viable algorithm for commercial database systems, even for variable-length records.

Efficient memory management also enables an external sort algorithm that degrades gracefully when its input is only slightly larger than or a small multiple of the available memory size. This is not the case with the usual implementations of external sorting, which incur I/O for the entire input even if it is as little as one record larger than memory. Thus, in some cases, our techniques may reduce I/O volume by a factor 10 compared to traditional database sort algorithms. Moreover, the gradual rather than abrupt growth in I/O volume for increasing input sizes significantly eases design and implementation of intra-query memory management policies.

Copyright © 1998 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD DiSC

CDROM Version: Load the CDROM "DiSC, Volume 1 Number 1" and ... Online Version (ACM WWW Account required): Full Text in PDF Format

ACM SIGMOD Anthology

DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...

Printed Edition

Laura M. Haas, Ashutosh Tiwary (Eds.): SIGMOD 1998, Proceedings ACM SIGMOD International Conference on Management of Data, June 2-4, 1998, Seattle, Washington, USA. ACM Press 1998, ISBN 0-89791-995-5 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML, SIGMOD Record 27(2), June 1998
Contents

Online Edition: ACM SIGMOD

[Abstract]
[Full Text (Postscript)]

References

[1]
Vladimir Estivill-Castro, Derick Wood: A Survey of Adaptive Sorting Algorithms. ACM Comput. Surv. 24(4): 441-476(1992) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[2]
Goetz Graefe: Query Evaluation Techniques for Large Databases. ACM Comput. Surv. 25(2): 73-170(1993) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[3]
Goetz Graefe: Sort-Merge-Join: An Idea Whose Time Has(h) Passed? ICDE 1994: 406-417 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[4]
Masaya Nakayama, Masaru Kitsuregawa, Mikio Takagi: Hash-Partitioned Join Method Using Dynamic Destaging Strategy. VLDB 1988: 468-478 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[5]
Donald E. Knuth: The Art of Computer Programming, Volume I: Fundamental Algorithms, 2nd Edition. Addison-Wesley 1973
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[6]
Donald E. Knuth: The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley 1973, ISBN 0-201-03803-X
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[7]
...
[8]
Vinay S. Pai, Peter J. Varman: Prefetching with Multiple Disks for External Mergesort: Simulation and Analysis. ICDE 1992: 273-282 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[9]
Betty Salzberg: Merging Sorted Runs Using Large Main Memory. Acta Inf. 27(3): 195-215(1989) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[10]
Paul R. Wilson, Mark S. Johnstone, Michael Neely, David Boles: Dynamic Storage Allocation: A Survey and Critical Review. IWMM 1995: 1-116 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[11]
LuoQuan Zheng, Per-Åke Larson: Speeding up External Mergesort. IEEE Trans. Knowl. Data Eng. 8(2): 322-332(1996) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML

Copyright © Mon Mar 15 03:54:36 2010 by Michael Ley (ley@uni-trier.de)