Indexes for large collections are often divided into shards that are distributed across multiple computers and searched in parallel to provide rapid interactive search. Typically,...
In contrast to traditional document retrieval, a web page as a whole is not a good information unit to search because it often contains multiple topics and a lot of irrelevant inf...
This paper describes the participation of the Technical University of Catalonia in the CLEF 2007 Question Answering on Speech Transcripts track. For the processing of man...
We propose a weakly-supervised approach for extracting class attributes from structured text available within Web documents. The overall precision of the extracted attributes is a...
The quality of an information retrieval system heavily depends on its retrieval function, which returns a similarity measurement between the query and each document in the collect...