In the last decade, many evaluation results have been created within the evaluation initiatives like TREC, NTCIR and CLEF. The large amount of data available has led to substanti...
Unveiled in late 2004, Google Book Search is an ambitious program to make all the world's books discoverable online. The sheer scale of the problem brings a number of unique ...
Coreferencing entities across documents in a large corpus enables advanced document understanding tasks such as question answering. This paper presents a novel cross document core...
Jian Huang 0002, Sarah M. Taylor, Jonathan L. Smit...
Versioned document collections are collections that contain multiple versions of each document. Important examples are Web archives, Wikipedia and other wikis, or source code and ...
The Web is a dynamic, ever changing collection of information. This paper explores changes in Web content by analyzing a crawl of 55,000 Web pages, selected to represent different...
Eytan Adar, Jaime Teevan, Susan T. Dumais, Jonatha...