Information retrieval systems are evaluated against test collections of topics, documents, and assessments of which documents are relevant to which topics. Documents are chosen fo...
We present a probabilistic model for a document corpus that combines many of the desirable features of previous models. The model is called “GaP” for Gamma-Poisson, the distri...
The probability that a term appears in relevant documents (P(t | R)) is a fundamental quantity in several probabilistic retrieval models; however, it is difficult to estimate without rele...
Maximizing only the relevance between queries and documents will not satisfy users if they want the top search results to present a wide coverage of topics by a few representative...
Yi Liu, Benyu Zhang, Zheng Chen, Michael R. Lyu, W...
In this short note we demonstrate the applicability of hyperlink downweighting by means of language model disagreement. The method filters out hyperlinks with no relevance to the ...