SIGIR
2011
ACM

Pseudo test collections for learning web search ranking functions

Test collections are the primary drivers of progress in information retrieval. They provide a yardstick for assessing the effectiveness of ranking functions in an automatic, rapid, and repeatable fashion and serve as training data for learning-to-rank approaches. However, manual construction of test collections tends to be slow, labor-intensive, and expensive. This paper examines the feasibility of constructing Web search test collections in a completely unsupervised manner given only a large Web corpus as input. Within the proposed framework, anchor text extracted from the Web graph is treated as a pseudo-query log from which pseudo queries are sampled. For each pseudo query, a set of relevant and non-relevant documents is selected using a variety of Web-specific features, including spam and aggregated anchor text weights. The automatically mined queries and judgments form a pseudo test collection that can be used for evaluation or for training learning-to-rank models. Experiments carr...
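The pipeline outlined in the abstract (anchor text as a pseudo-query log, sampled pseudo queries, automatically selected relevant and non-relevant documents) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `build_pseudo_collection`, the scoring rule (anchor count discounted by a spam probability), and the choice of negatives (documents the anchor text never points at) are all assumptions made for illustration only.

```python
import random
from collections import Counter, defaultdict


def build_pseudo_collection(anchors, spam, num_queries, num_rel=1, num_nonrel=1, seed=0):
    """Build {query: {doc_id: label}} from (anchor_text, target_doc) pairs.

    `spam` maps doc_id -> a hypothetical spam probability in [0, 1].
    Labels are 1 (pseudo-relevant) and 0 (pseudo-non-relevant).
    """
    anchors = list(anchors)

    # Aggregate anchor text: how often each normalized text points at each document.
    weights = defaultdict(Counter)
    for text, doc in anchors:
        weights[text.strip().lower()][doc] += 1

    all_docs = sorted({doc for _, doc in anchors})
    rng = random.Random(seed)

    # Treat the distinct anchor texts as the pseudo-query log and sample from it.
    texts = sorted(weights)
    queries = rng.sample(texts, min(num_queries, len(texts)))

    collection = {}
    for query in queries:
        targets = weights[query]
        # Assumed relevance score: anchor count discounted by spam probability.
        ranked = sorted(
            targets,
            key=lambda d: (-targets[d] * (1.0 - spam.get(d, 0.0)), d),
        )
        relevant = ranked[:num_rel]
        # Assumed negatives: documents this anchor text never points at.
        others = [d for d in all_docs if d not in targets]
        nonrelevant = rng.sample(others, min(num_nonrel, len(others)))

        judgments = {d: 1 for d in relevant}
        judgments.update({d: 0 for d in nonrelevant})
        collection[query] = judgments
    return collection


# Toy input, invented for this example.
anchors = [
    ("Python docs", "d1"),
    ("python docs", "d1"),
    ("python docs", "d2"),
    ("cat pictures", "d3"),
]
collection = build_pseudo_collection(anchors, spam={"d2": 0.9}, num_queries=2)
```

The resulting query-to-judgment mapping has the same shape as a manually built test collection, so in principle it could feed an evaluation script or a learning-to-rank trainer. How the paper actually weights and thresholds its features is not stated in the abstract excerpt above.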
Added 17 Sep 2011
Updated 17 Sep 2011
Type Journal
Year 2011
Where SIGIR
Authors Nima Asadi, Donald Metzler, Tamer Elsayed, Jimmy Lin