ERCIMDL
1997
Springer

Scalable Text Retrieval for Large Digital Libraries

It is argued that digital libraries of the future will contain terabyte-scale collections of digital text and that full-text searching techniques will be required to operate over collections of this magnitude. Algorithms expected to be capable of scaling to these data sizes using clusters of modern workstations are described. First, basic indexing and retrieval algorithms operating at performance levels comparable to other leading systems over gigabytes of text on a single workstation are presented. Next, simple mechanisms for extending query processing capacity to much greater collection sizes are presented, to tens of gigabytes for single workstations and to terabytes for clusters of such workstations. Query-processing efficiency on a single workstation is shown to deteriorate dramatically when data size is increased above a certain multiple of physical memory size. By contrast, the number of clustered workstations necessary to maintain a constant level of service increases linearly wi...
David Hawking
Added 07 Aug 2010
Updated 07 Aug 2010
Type Conference
Year 1997
Where ERCIMDL
Authors David Hawking