Active Storage for Large-Scale Data Mining and Multimedia.
Erik Riedel, Garth A. Gibson, Christos Faloutsos:
Active Storage for Large-Scale Data Mining and Multimedia.
VLDB 1998: 62-73
@inproceedings{DBLP:conf/vldb/RiedelGF98,
author = {Erik Riedel and
Garth A. Gibson and
Christos Faloutsos},
editor = {Ashish Gupta and
Oded Shmueli and
Jennifer Widom},
title = {Active Storage for Large-Scale Data Mining and Multimedia},
booktitle = {VLDB'98, Proceedings of 24th International Conference on Very
Large Data Bases, August 24-27, 1998, New York City, New York,
USA},
publisher = {Morgan Kaufmann},
year = {1998},
isbn = {1-55860-566-5},
pages = {62-73},
ee = {db/conf/vldb/RiedelGF98.html},
crossref = {DBLP:conf/vldb/98},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
The increasing performance and decreasing cost of processors and memory are causing system intelligence to move into peripherals from the CPU.
Storage system designers are using this trend toward "excess" compute power to perform more complex processing and optimizations inside storage devices.
To date, such optimizations have been at relatively low levels of the storage protocol.
At the same time, trends in storage density, mechanics, and electronics are eliminating the bottleneck in moving data off the media and putting pressure on interconnects and host processors to move data more efficiently.
We propose a system called Active Disks that takes advantage of processing power on individual disk drives to run application-level code.
Moving portions of an application's processing to execute directly at disk drives can dramatically reduce data traffic and take advantage of the storage parallelism already present in large systems today.
We discuss several types of applications that would benefit from this capability with a focus on the areas of database, data mining, and multimedia.
We develop an analytical model of the speedups possible for scan-intensive applications in an Active Disk system.
We also experiment with a prototype Active Disk system using relatively low-powered processors in comparison to a database server system with a single, fast processor.
Our experiments validate the intuition in our model and demonstrate speedups of 2x on 10 disks across four scan-based applications.
The model promises linear speedups in disk arrays of hundreds of disks, provided the application data is large enough.
Copyright © 1998 by the VLDB Endowment.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.
Printed Edition
Ashish Gupta, Oded Shmueli, Jennifer Widom (Eds.):
VLDB'98, Proceedings of 24th International Conference on Very Large Data Bases, August 24-27, 1998, New York City, New York, USA.
Morgan Kaufmann 1998, ISBN 1-55860-566-5
References
- [Acharya98]
- ...
- [Agrawal95]
- Rakesh Agrawal, Ramakrishnan Srikant:
Fast Algorithms for Mining Association Rules in Large Databases.
VLDB 1994: 487-499
- [Agrawal96]
- Rakesh Agrawal, John C. Shafer:
Parallel Mining of Association Rules.
IEEE Trans. Knowl. Data Eng. 8(6): 962-969 (1996)
- [Almaden97]
- ...
- [Arpaci-Dusseau97]
- Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, David E. Culler, Joseph M. Hellerstein, David A. Patterson:
High-Performance Sorting on Networks of Workstations.
SIGMOD Conference 1997: 243-254
- [Arya94]
- Manish Arya, William F. Cody, Christos Faloutsos, Joel E. Richardson, Arthur Toya:
QBISM: Extending a DBMS to Support 3D Medical Images.
ICDE 1994: 314-325
- [Barclay97]
- ...
- [Berchtold96]
- Stefan Berchtold, Daniel A. Keim, Hans-Peter Kriegel:
The X-tree: An Index Structure for High-Dimensional Data.
VLDB 1996: 28-39
- [Berchtold97]
- Stefan Berchtold, Christian Böhm, Daniel A. Keim, Hans-Peter Kriegel:
A Cost Model For Nearest Neighbor Search in High-Dimensional Data Space.
PODS 1997: 78-86
- [Bershad95]
- Brian N. Bershad, Stefan Savage, Przemyslaw Pardyak, Emin Gün Sirer, Marc E. Fiuczynski, David Becker, Craig Chambers, Susan J. Eggers:
Extensibility, Safety and Performance in the SPIN Operating System.
SOSP 1995: 267-284
- [Bitton88]
- Dina Bitton, Jim Gray:
Disk Shadowing.
VLDB 1988: 331-338
- [Blelloch98]
- ...
- [Boral83]
- Haran Boral, David J. DeWitt:
Database Machines: An Idea Whose Time Passed? A Critique of the Future of Database Machines.
IWDM 1983: 166-187
- [Cao94]
- Pei Cao, Swee Boon Lim, Shivakumar Venkataraman, John Wilkes:
The TickerTAIP Parallel RAID Architecture.
ACM Trans. Comput. Syst. 12(3): 236-269 (1994)
- [DeWitt81]
- David J. DeWitt, Paula B. Hawthorn:
A Performance Evaluation of Data Base Machine Architectures (Invited Paper).
VLDB 1981: 199-214
- [DeWitt85]
- David J. DeWitt, Robert H. Gerber:
Multiprocessor Hash-Based Join Algorithms.
VLDB 1985: 151-164
- [DeWitt91]
- David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider:
Parallel Sorting on a Shared-Nothing Architecture using Probabilistic Splitting.
PDIS 1991: 280-291
- [DeWitt92]
- David J. DeWitt, Jim Gray:
Parallel Database Systems: The Future of High Performance Database Systems.
Commun. ACM 35(6): 85-98 (1992)
- [Drapeau94]
- Ann L. Drapeau, Ken Shirriff, John H. Hartman, Ethan L. Miller, Srinivasan Seshan, Randy H. Katz, Ken Lutz, David A. Patterson, Edward K. Lee, Peter M. Chen, Garth A. Gibson:
RAID-II: A High-Bandwidth Network File Server.
ISCA 1994: 234-244
- [Faloutsos94]
- Christos Faloutsos, Ron Barber, Myron Flickner, Jim Hafner, Wayne Niblack, Dragutin Petkovic, William Equitz:
Efficient and Effective Querying by Image Content.
J. Intell. Inf. Syst. 3(3/4): 231-262 (1994)
- [Faloutsos96]
- ...
- [Flickner95]
- Myron Flickner, Harpreet S. Sawhney, Jonathan Ashley, Qian Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee, Dragutin Petkovic, David Steele, Peter Yanker:
Query by Image and Video Content: The QBIC System.
IEEE Computer 28(9): 23-32 (1995)
- [Gibson97]
- Garth A. Gibson, David Nagle, Khalil Amiri, Fay W. Chang, Eugene M. Feinberg, Howard Gobioff, Chen Lee, Berend Ozceri, Erik Riedel, David Rochberg, Jim Zelenka:
File Server Scaling with Network-Attached Secure Disks.
SIGMETRICS 1997: 272-284
- [Gibson98]
- ...
- [Gosling96]
- James Gosling, William N. Joy, Guy L. Steele Jr.:
The Java Language Specification.
Addison-Wesley 1996, ISBN 0-201-63451-1
- [Gray97]
- ...
- [Grochowski96]
- ...
- [Hsiao79]
- ...
- [Keeton98]
- ...
- [Kitsuregawa83]
- Masaru Kitsuregawa, Hidehiko Tanaka, Tohru Moto-Oka:
Application of Hash to Data Base Machine and Its Architecture.
New Generation Comput. 1(1): 63-74 (1983)
- [Kotz94]
- David Kotz:
Disk-directed I/O for MIMD Multiprocessors.
OSDI 1994: 61-74
- [Lee96]
- Edward K. Lee, Chandramohan A. Thekkath:
Petal: Distributed Virtual Disks.
ASPLOS 1996: 84-92
- [Livny87]
- Miron Livny, Setrag Khoshafian, Haran Boral:
Multi-Disk Management Algorithms.
SIGMETRICS 1987: 69-77
- [Necula96]
- George C. Necula, Peter Lee:
Safe Kernel Extensions Without Run-Time Checking.
OSDI 1996: 229-243
- [Ozharahan75]
- ...
- [Patterson88]
- David A. Patterson, Garth A. Gibson, Randy H. Katz:
A Case for Redundant Arrays of Inexpensive Disks (RAID).
SIGMOD Conference 1988: 109-116
- [Patterson95]
- R. Hugo Patterson, Garth A. Gibson, Eka Ginting, Daniel Stodolsky, Jim Zelenka:
Informed Prefetching and Caching.
SOSP 1995: 79-95
- [Quest97]
- ...
- [Riedel97]
- ...
- [Romer96]
- Theodore H. Romer, Dennis Lee, Geoffrey M. Voelker, Alec Wolman, Wayne A. Wong, Jean-Loup Baer, Brian N. Bershad, Henry M. Levy:
The Structure and Performance of Interpreters.
ASPLOS 1996: 150-159
- [Ruemmler91]
- ...
- [Seagate97]
- ...
- [Small95]
- ...
- [Smith79]
- ...
- [Smith95]
- ...
- [StorageTek94]
- ...
- [Su75]
- Stanley Y. W. Su, G. Jack Lipovski:
CASSM: A Cellular System for Very Large Data Bases.
VLDB 1975: 456-472
- [TPC98]
- ...
- [TriCore97]
- ...
- [Turley96]
- ...
- [VanMeter96]
- ...
- [Virage98]
- ...
- [Wactlar96]
- Howard D. Wactlar, Takeo Kanade, Michael A. Smith, Scott M. Stevens:
Intelligent Access to Digital Video: Informedia Project.
IEEE Computer 29(5): 46-52 (1996)
- [Wahbe93]
- Robert Wahbe, Steven Lucco, Thomas E. Anderson, Susan L. Graham:
Efficient Software-Based Fault Isolation.
SOSP 1993: 203-216
- [Welling98]
- ...
- [Wilkes95]
- John Wilkes, Richard A. Golding, Carl Staelin, Tim Sullivan:
The HP AutoRAID Hierarchical Storage System.
SOSP 1995: 96-108
- [Yao85]
- Andrew Chi-Chih Yao, F. Frances Yao:
A General Approach to d-Dimensional Geometric Queries (Extended Abstract).
STOC 1985: 163-168