Sciweavers

SIGIR
2008
ACM

Evaluation over thousands of queries

Information retrieval evaluation has typically been performed over several dozen queries, each judged to near-completeness. There has been a great deal of recent work on evaluation over much smaller judgment sets: how to select the best set of documents to judge and how to estimate evaluation measures when few judgments are available. In light of this, it should be possible to evaluate over many more queries without much more total judging effort. The Million Query Track at TREC 2007 used two document selection algorithms to acquire relevance judgments for more than 1,800 queries. We present results of the track, along with deeper analysis: investigating tradeoffs between the number of queries and the number of judgments shows that, up to a point, evaluation over more queries with fewer judgments is more cost-effective than, and as reliable as, evaluation over fewer queries with more judgments. Total assessor effort can be reduced by 95% with no appreciable increase in evaluation errors. Categories and Subject D...
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2008
Where SIGIR
Authors Ben Carterette, Virgiliu Pavlu, Evangelos Kanoulas, Javed A. Aslam, James Allan