Early TREC-style Question Answering Systems were characterized by the following features: (a) the answer of the question was known to be included in a given local corpus, (b) the ...
This paper considers the problem of identifying on the Web compound documents (cDocs) ? groups of web pages that in aggregate constitute semantically coherent information entities...
Due to resource constraints, search engines usually have difficulties keeping the local database completely synchronized with the Web. To detect as many changes as possible, the ...
Qingzhao Tan, Ziming Zhuang, Prasenjit Mitra, C. L...
Recent research has had some success using the length of time a user displays a document in their web browser as implicit feedback for document preference. However, most studies h...
We present a principled methodology for filtering news stories by formal measures of information novelty, and show how the techniques can be used to custom-tailor newsfeeds based ...
Evgeniy Gabrilovich, Susan T. Dumais, Eric Horvitz