This paper presents a new method for building domain-specific web search engines. Previous methods eliminate irrelevant documents from the pages accessed using heuristics based on...
In this paper we describe the LIMSI Spoken Document Retrieval system used in the TREC-9 evaluation. This system combines an adapted version of the LIMSI 1999 Hub-4E transcription ...
Jean-Luc Gauvain, Lori Lamel, Claude Barras, Gille...
Many large-scale Web applications that require ranked top-k retrieval are implemented using inverted indices. An inverted index represents a sparse term-document matrix, where non...
George Beskales, Marcus Fontoura, Maxim Gurevich, ...
XML schema design has two opposing goals: elimination of update anomalies requires that the schema be as normalized as possible; yet higher query performance and simpler query exp...
Nuwee Wiwatwattana, H. V. Jagadish, Laks V. S. Lak...
Many modern natural language processing applications utilize search engines to locate large numbers of Web documents or to compute statistics over the Web corpus. Yet Web search e...