Sciweavers

CLEF
2010
Springer

MapReduce for Information Retrieval Evaluation: "Let's Quickly Test This on 12 TB of Data"

We propose using MapReduce to quickly test new retrieval approaches on a cluster of machines by sequentially scanning all documents. In a small case study, we use a cluster of 15 low-cost machines to search a web crawl of 0.5 billion pages, showing that sequential scanning is a viable approach to running large-scale information retrieval experiments with little effort. The code is available to other researchers at http://mirex.sourceforge.net
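The sequential-scan idea in the abstract can be illustrated with a minimal sketch. This is not the authors' code (their implementation is at the URL above); it is a local, single-process simulation in Python under stated assumptions. The function names, the toy documents and queries, and the raw term-frequency scoring are all hypothetical stand-ins chosen for illustration. The map step scores each document against every query in one pass, and the reduce step keeps the best-scoring documents per query.

```python
import heapq
from collections import Counter, defaultdict

# Hypothetical toy queries; a real experiment would load a full query set.
QUERIES = {"q1": ["map", "reduce"], "q2": ["web", "crawl"]}


def map_document(doc_id, text, queries):
    """Map step: score one document against every query in a single pass."""
    term_counts = Counter(text.lower().split())
    for query_id, terms in queries.items():
        # Assumption: a raw term-frequency sum stands in for a real ranking model.
        score = sum(term_counts[t] for t in terms)
        if score > 0:
            yield query_id, (score, doc_id)


def reduce_query(query_id, scored_docs, k=2):
    """Reduce step: keep the k best-scoring documents for one query."""
    return query_id, heapq.nlargest(k, scored_docs)


def run_sequential_scan(documents, queries, k=2):
    """Simulate the map, shuffle, and reduce phases in one process."""
    grouped = defaultdict(list)
    for doc_id, text in documents:  # sequential scan over all documents
        for query_id, pair in map_document(doc_id, text, queries):
            grouped[query_id].append(pair)  # shuffle: group by query id
    return dict(reduce_query(q, pairs, k) for q, pairs in grouped.items())


# Hypothetical toy collection standing in for a web crawl.
DOCS = [
    ("d1", "map reduce on a cluster map"),
    ("d2", "a web crawl of many pages"),
    ("d3", "reduce the web"),
]
RESULTS = run_sequential_scan(DOCS, QUERIES)
```

Because every document is read once and no index is built, trying a different retrieval approach in this sketch only means changing the scoring line inside map_document. On a real cluster, the framework would run the map step in parallel over splits of the collection.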
Djoerd Hiemstra, Claudia Hauff
Added 08 Nov 2010
Updated 08 Nov 2010
Type Conference
Year 2010
Where CLEF
Authors Djoerd Hiemstra, Claudia Hauff