Sampling Large Databases for Association Rules.
Hannu Toivonen:
Sampling Large Databases for Association Rules.
VLDB 1996: 134-145
@inproceedings{DBLP:conf/vldb/Toivonen96,
author = {Hannu Toivonen},
editor = {T. M. Vijayaraman and
Alejandro P. Buchmann and
C. Mohan and
Nandlal L. Sarda},
title = {Sampling Large Databases for Association Rules},
booktitle = {VLDB'96, Proceedings of 22th International Conference on Very
Large Data Bases, September 3-6, 1996, Mumbai (Bombay), India},
publisher = {Morgan Kaufmann},
year = {1996},
isbn = {1-55860-382-4},
pages = {134-145},
ee = {db/conf/vldb/Toivonen96.html},
crossref = {DBLP:conf/vldb/96},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
Discovery of association rules is an important database mining problem.
Current algorithms for finding association rules require several passes
over the analyzed database, and obviously the role of I/O overhead is
very significant for very large databases. We present new algorithms that
reduce the database activity considerably. The idea is to pick a random
sample, to find using this sample all association rules that probably
hold in the whole database, and then to verify the results with the rest
of the database. The algorithms thus produce exact association rules, not
approximations based on a sample. The approach is, however, probabilistic,
and in those rare cases where our sampling method does not produce all
association rules, the missing rules can be found in a second pass. Our
experiments show that the proposed algorithms can find association rules
very efficiently in only one database pass.
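The sample-then-verify idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract does not say how "probably hold" is decided or how a miss is detected, so the sketch makes two assumptions: it mines the sample with a lowered support threshold, and it also counts the minimal itemsets that were just outside the sample result (a "negative border") so that a possible miss can be flagged. The function names, the `lowered_support` parameter, and the explicit `items` universe are hypothetical. For simplicity the verification pass counts over the whole database rather than only the part outside the sample.

```python
import random
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise search; returns {itemset: relative support} for frequent sets."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    found = {}
    while level:
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        kept = {c: k / n for c, k in counts.items() if k / n >= min_support}
        found.update(kept)
        # Build candidates one item larger whose subsets are all frequent.
        size = len(level[0]) + 1
        cands = {a | b for a in kept for b in kept if len(a | b) == size}
        level = [c for c in cands
                 if all(frozenset(s) in kept for s in combinations(c, size - 1))]
    return found


def negative_border(found, items):
    """Minimal itemsets not in `found` whose proper subsets are all in `found`."""
    border = {frozenset([i]) for i in items if frozenset([i]) not in found}
    for a in found:
        for b in found:
            c = a | b
            if len(a) == len(b) and len(c) == len(a) + 1 and c not in found:
                if all(frozenset(s) in found
                       for s in combinations(c, len(c) - 1)):
                    border.add(c)
    return border


def sample_then_verify(transactions, items, min_support,
                       lowered_support, sample_size, seed=0):
    """Mine a random sample, then check the candidates in one database pass.

    Returns (exact frequent itemsets, flag saying a second pass may be needed).
    """
    rng = random.Random(seed)
    sample = rng.sample(transactions, sample_size)

    # Assumption: a lowered threshold makes it less likely to miss a set.
    in_sample = frequent_itemsets(sample, lowered_support)
    border = negative_border(in_sample, items)
    candidates = set(in_sample) | border

    # One pass over the database gives exact supports for all candidates.
    counts = dict.fromkeys(candidates, 0)
    for t in transactions:
        for c in candidates:
            if c <= t:
                counts[c] += 1
    n = len(transactions)
    exact = {c: k / n for c, k in counts.items() if k / n >= min_support}

    # A frequent border set means larger frequent sets may have been missed.
    possible_miss = any(c in exact for c in border)
    return exact, possible_miss
```

Transactions are expected as a list of `frozenset` objects. When `possible_miss` comes back true, a second pass over the database would be needed to recover the missing itemsets, which corresponds to the rare case the abstract mentions.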
Copyright © 1996 by the VLDB Endowment.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or
distributed for direct commercial advantage, the VLDB
copyright notice and the title of the publication and
its date appear, and notice is given that copying
is by the permission of the Very Large Data Base
Endowment. To copy otherwise, or to republish, requires
a fee and/or special permission from the Endowment.
Printed Edition
T. M. Vijayaraman, Alejandro P. Buchmann, C. Mohan, Nandlal L. Sarda (Eds.):
VLDB'96, Proceedings of 22th International Conference on Very Large Data Bases, September 3-6, 1996, Mumbai (Bombay), India.
Morgan Kaufmann 1996, ISBN 1-55860-382-4
References
- [AIS93]
- Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami:
Mining Association Rules between Sets of Items in Large Databases.
SIGMOD Conference 1993: 207-216
- [AMS+96]
- Rakesh Agrawal, Heikki Mannila, Ramakrishnan Srikant, Hannu Toivonen, A. Inkeri Verkamo:
Fast Discovery of Association Rules.
Advances in Knowledge Discovery and Data Mining 1996: 307-328
- [AS92]
- Noga Alon, Joel Spencer:
The Probabilistic Method.
John Wiley 1992, ISBN 0-471-53588-5
- [AS94]
- Rakesh Agrawal, Ramakrishnan Srikant:
Fast Algorithms for Mining Association Rules in Large Databases.
VLDB 1994: 487-499
- [FPSSU96]
- Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, Ramasamy Uthurusamy (Eds.):
Advances in Knowledge Discovery and Data Mining.
AAAI/MIT Press 1996, ISBN 0-262-56097-6
- [HF95]
- Jiawei Han, Yongjian Fu:
Discovery of Multiple-Level Association Rules from Large Databases.
VLDB 1995: 420-431
- [HKMT95]
- Marcel Holsheimer, Martin L. Kersten, Heikki Mannila, Hannu Toivonen:
A Perspective on Databases and Data Mining.
KDD 1995: 150-155
- [HS92]
- Peter J. Haas, Arun N. Swami:
Sequential Sampling Procedures for Query Size Estimation.
SIGMOD Conference 1992: 341-350
- [HS93]
- ...
- [KM94]
- Jyrki Kivinen, Heikki Mannila:
The Power of Sampling in Knowledge Discovery.
PODS 1994: 77-85
- [KMR+94]
- Mika Klemettinen, Heikki Mannila, Pirjo Ronkainen, Hannu Toivonen, A. Inkeri Verkamo:
Finding Interesting Rules from Large Sets of Discovered Association Rules.
CIKM 1994: 401-407
- [LSL95]
- Hongjun Lu, Rudy Setiono, Huan Liu:
NeuroRule: A Connectionist Approach to Data Mining.
VLDB 1995: 478-489
- [MT96]
- ...
- [MTV94]
- Heikki Mannila, Hannu Toivonen, A. Inkeri Verkamo:
Efficient Algorithms for Discovering Association Rules.
KDD Workshop 1994: 181-192
- [OR89]
- Frank Olken, Doron Rotem:
Random Sampling from B+ Trees.
VLDB 1989: 269-277
- [PCY95]
- Jong Soo Park, Ming-Syan Chen, Philip S. Yu:
An Effective Hash Based Algorithm for Mining Association Rules.
SIGMOD Conference 1995: 175-186
- [PSF91]
- Gregory Piatetsky-Shapiro, William J. Frawley (Eds.):
Knowledge Discovery in Databases.
AAAI/MIT Press 1991, ISBN 0-262-62080-4
- [SA95]
- Ramakrishnan Srikant, Rakesh Agrawal:
Mining Generalized Association Rules.
VLDB 1995: 407-419
- [SON95]
- Ashok Savasere, Edward Omiecinski, Shamkant B. Navathe:
An Efficient Algorithm for Mining Association Rules in Large Databases.
VLDB 1995: 432-444
- [TKR+95]
- ...