Modern techniques for distributed information retrieval use a set of documents sampled from each server, but these samples have been underutilised in server selection. We describe...
In this paper we describe a methodology for harvesting information from large distributed repositories (e.g. large Web sites) with minimum user intervention. The methodol...
Fabio Ciravegna, Sam Chapman, Alexiei Dingli, Yori...
Intuitively, any 'bag of words' approach in IR should benefit from taking term dependencies into account. Unfortunately, for years the results of exploiting such dependencies ...
Eduard Hoenkamp, Peter Bruza, Dawei Song, Qiang Hu...
We report experiments conducted by members of the SIG team at the IRIT laboratory in the CLEF medical retrieval task, namely ImageCLEFmed. In 2010, we are particularly i...
Traditionally, text classifiers are built from labeled training examples. Labeling is usually done manually by human experts (or the users), which is a labor-intensive and time co...