Sciweavers

ACSC
2000
IEEE

Needles and Haystacks: A Search Engine for Personal Information Collections

Information retrieval systems can be partitioned into two main classes: large-scale systems, intended for massive volumes of data, that make use of an inverted index or some other auxiliary data structure; and small-scale systems, based upon sequential pattern matching, that most computer users employ when hunting for missing email and news items. In this paper we describe a hybrid approach that offers the ranked queries and similarity matching of a genuine information retrieval system, but does so without any need for an index to be precomputed. This software tool, which we call seft, matches conventional information retrieval systems in retrieval effectiveness. In resource efficiency it is considerably slower than grep-like tools, but fast enough to be useful on hundreds of megabytes of text.
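
To make the idea of ranked querying without a precomputed index concrete, here is a minimal Python sketch. It is not seft's algorithm: the abstract does not specify the similarity measure, so the log-damped, length-normalised term-frequency score below is an assumption chosen only for illustration, and the names `tokenize`, `score`, and `ranked_scan` are hypothetical. The sketch reads each document once, in sequence, and keeps the best `k` matches in a small heap.

```python
import heapq
import math
import re
from collections import Counter

TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return TOKEN.findall(text.lower())


def score(query_terms, doc_text):
    """Illustrative similarity (an assumption, not seft's measure):
    sum of log-damped query-term frequencies, divided by sqrt(doc length)."""
    tokens = tokenize(doc_text)
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    raw = sum(math.log(1 + counts[t]) for t in query_terms)
    return raw / math.sqrt(len(tokens))


def ranked_scan(query, documents, k=3):
    """Single sequential pass over (doc_id, text) pairs; no index is built.
    Returns up to k (score, doc_id) pairs, best first."""
    query_terms = set(tokenize(query))
    heap = []  # min-heap holding the k best (score, doc_id) pairs so far
    for doc_id, text in documents:
        s = score(query_terms, text)
        if s <= 0:
            continue  # document shares no term with the query
        if len(heap) < k:
            heapq.heappush(heap, (s, doc_id))
        elif (s, doc_id) > heap[0]:
            heapq.heapreplace(heap, (s, doc_id))
    return sorted(heap, reverse=True)


if __name__ == "__main__":
    docs = [
        ("a", "lost email about budget"),
        ("b", "news item on weather"),
        ("c", "budget budget email email email"),
    ]
    for s, doc_id in ranked_scan("budget email", docs, k=2):
        print(doc_id, round(s, 3))
```

Because every query rescans the whole collection, the cost grows linearly with the amount of text. That is consistent with the trade-off the abstract describes: no index to build or maintain, in exchange for slower queries than an indexed system.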
Owen de Kretser, Alistair Moffat
Added 30 Jul 2010
Updated 30 Jul 2010
Type Conference
Year 2000
Where ACSC