
Effective & Efficient Document Ranking without using a Large Lexicon.

Yasushi Ogawa: Effective & Efficient Document Ranking without using a Large Lexicon. VLDB 1996: 192-202
@inproceedings{DBLP:conf/vldb/Ogawa96,
  author    = {Yasushi Ogawa},
  editor    = {T. M. Vijayaraman and
               Alejandro P. Buchmann and
               C. Mohan and
               Nandlal L. Sarda},
  title     = {Effective {\&} Efficient Document Ranking without using a Large
               Lexicon},
  booktitle = {VLDB'96, Proceedings of 22th International Conference on Very
               Large Data Bases, September 3-6, 1996, Mumbai (Bombay), India},
  publisher = {Morgan Kaufmann},
  year      = {1996},
  isbn      = {1-55860-382-4},
  pages     = {192-202},
  ee        = {db/conf/vldb/Ogawa96.html},
  crossref  = {DBLP:conf/vldb/96},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

Although word-based methods are commonly used in document retrieval, they cannot be applied directly to languages that have no obvious word separator. Given a lexicon, it is possible to identify words in documents, but a large lexicon is troublesome to maintain and makes retrieval systems large and complicated. This paper proposes an effective and efficient ranking method that does not use a large lexicon; words need not be identified during document registration because a character-based signature file is used as the access structure. During document retrieval, a user request is statistically analyzed to generate an appropriate query, and the query is evaluated efficiently in a word-based manner using the character-based index. We also propose two optimization techniques to accelerate retrieval.

Copyright © 1996 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Printed Edition

T. M. Vijayaraman, Alejandro P. Buchmann, C. Mohan, Nandlal L. Sarda (Eds.): VLDB'96, Proceedings of 22th International Conference on Very Large Data Bases, September 3-6, 1996, Mumbai (Bombay), India. Morgan Kaufmann 1996, ISBN 1-55860-382-4

References

[1]
Eric W. Brown: Fast Evaluation of Structured Queries for Information Retrieval. SIGIR 1995: 30-38
[2]
Chris Buckley, A. F. Lewit: Optimization of Inverted Vector Searches. SIGIR 1985: 97-110
[3]
Lee-Feng Chien: Fast and Quasi-Natural Language Search for Gigabits of Chinese Texts. SIGIR 1995: 112-120
[4]
W. Bruce Croft, Pasquale Savino: Implementing Ranking Strategies Using Text Signatures. ACM Trans. Inf. Syst. 6(1): 42-62 (1988)
[5]
William B. Frakes, Ricardo A. Baeza-Yates (Eds.): Information Retrieval: Data Structures & Algorithms. Prentice-Hall 1992, ISBN 0-13-463837-9
[6]
Hideo Fujii, W. Bruce Croft: A Comparison of Indexing Techniques for Japanese Text Retrieval. SIGIR 1993: 237-246
[7]
...
[8]
...
[9]
...
[10]
Marti A. Hearst, Christian Plaunt: Subtopic Structuring for Full-Length Document Access. SIGIR 1993: 59-68
[11]
...
[12]
...
[13]
...
[14]
...
[15]
...
[16]
...
[17]
...
[18]
Alistair Moffat, Justin Zobel: Fast Ranking in Limited Space. ICDE 1994: 428-437
[19]
...
[20]
Yasushi Ogawa, Ayako Bessho, Masajirou Iwasaki, M. Nishimura, Masako Hirose: A New Indexing and Text Ranking Method for Japanese Text Databases Using Simple-Word Compounds as Keywords. DASFAA 1993: 197-204
[21]
Yasushi Ogawa, Masajirou Iwasaki: A New Character-based Indexing Organization using Frequency Data for Japanese Documents. SIGIR 1995: 121-129
[22]
...
[23]
Stephen E. Robertson, Steve Walker: Some Simple Effective Approximations to the 2-Poisson Model for Probabilistic Weighted Retrieval. SIGIR 1994: 232-241
[24]
...
[25]
...
[26]
Peter Schäuble: SPIDER: A Multiuser Information Retrieval System for Semistructured and Dynamic Data. SIGIR 1993: 318-327
[27]
...
[28]
...
[29]
...
[30]
...
[31]
...
[32]
...

Copyright © Tue Mar 16 02:22:05 2010 by Michael Ley (ley@uni-trier.de)