As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. The goal of this work i...
: In the last decade, many evaluation results have been created within the evaluation initiatives like TREC, NTCIR and CLEF. The large amount of data available has led to substanti...
Since 2002, INEX has been working towards the goal of establishing an infrastructure, in the form of a large XML test collection and appropriate scoring methods, for the evaluation...
The goal of system evaluation in information retrieval has always been to determine which of a set of systems is superior on a given collection. The tool used to determine system ...
Abstract. This paper examines technology developed to support largescale distributed digital libraries. We describe the method used for harvesting collection information using stan...