We consider the coverage testing problem where we are given a document and a corpus with a limited query interface and asked to find if the corpus contains a near-duplicate of th...
Ali Dasdan, Paolo D'Alberto, Santanu Kolay, Chris ...
Pseudo-relevance feedback has proven to be an effective strategy for improving retrieval accuracy in all retrieval models. However the performance of existing pseudo feedback meth...
We are experiencing an unprecedented increase of content contributed by users in forums such as blogs, social networking sites and microblogging services. Such abundance of conten...
The basic aim of the model proposed here is to automatically build semantic metatext structure for texts that would allow us to search and extract discourse and semantic informati...
tion Abstract ChengXiang Zhai (Advisor: John Lafferty) Language Technologies Institute School of Computer Science Carnegie Mellon University With the dramatic increase in online in...