ICTIR
2009
Springer

Modeling the Score Distributions of Relevant and Non-relevant Documents

Empirical modeling of the score distributions associated with retrieved documents is an essential task for many retrieval applications. In this work, we propose modeling the relevant documents’ scores by a mixture of Gaussians and the non-relevant documents’ scores by a Gamma distribution. Applying variational inference, we automatically trade off goodness-of-fit against model complexity. We test our model on traditional retrieval functions and on actual search engines submitted to TREC. We demonstrate the utility of our model in inferring precision-recall curves. In all experiments, our model outperforms the dominant exponential-Gaussian model.
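As a rough illustration of how a precision-recall curve might be inferred from such a pair of score distributions, the sketch below sweeps a score threshold and computes the modeled precision and recall at each point. It is not the authors' implementation: the variational fitting step is not shown, all parameter values (`REL_PRIOR`, `REL_COMPONENTS`, `NONREL_SHAPE`, `NONREL_SCALE`) are made up for illustration, and the Gamma shape is assumed to be an integer so that its tail probability has a closed form using only the standard library.

```python
import math

def gaussian_tail(t, mean, std):
    """P(score > t) for a single Gaussian component."""
    return 0.5 * math.erfc((t - mean) / (std * math.sqrt(2.0)))

def gaussian_mixture_tail(t, components):
    """P(score > t) for a mixture given as (weight, mean, std) tuples."""
    return sum(w * gaussian_tail(t, m, s) for w, m, s in components)

def gamma_tail(t, shape, scale):
    """P(score > t) for a Gamma with INTEGER shape (Erlang closed form).

    The integer-shape restriction is an assumption of this sketch only.
    """
    if t <= 0:
        return 1.0
    x = t / scale
    return math.exp(-x) * sum(x ** k / math.factorial(k) for k in range(shape))

def precision_recall(t, rel_prior, rel_components, nonrel_shape, nonrel_scale):
    """Modeled (precision, recall) when retrieving all documents scoring above t."""
    recall = gaussian_mixture_tail(t, rel_components)
    rel_mass = rel_prior * recall
    nonrel_mass = (1.0 - rel_prior) * gamma_tail(t, nonrel_shape, nonrel_scale)
    total = rel_mass + nonrel_mass
    # If both tails underflow to zero, nothing is retrieved; report precision 1.0.
    precision = rel_mass / total if total > 0 else 1.0
    return precision, recall

# Hypothetical, hand-picked parameters (not taken from the paper).
REL_PRIOR = 0.1                                     # fraction of relevant documents
REL_COMPONENTS = [(0.6, 6.0, 1.0), (0.4, 9.0, 1.5)]  # (weight, mean, std)
NONREL_SHAPE, NONREL_SCALE = 2, 1.0                 # Gamma for non-relevant scores

thresholds = [0.5 * i for i in range(31)]           # 0.0, 0.5, ..., 15.0
curve = [
    precision_recall(t, REL_PRIOR, REL_COMPONENTS, NONREL_SHAPE, NONREL_SCALE)
    for t in thresholds
]
```

Each entry of `curve` is a `(precision, recall)` pair for one threshold; lowering the threshold raises recall, and the shapes of the two fitted distributions determine how quickly precision falls in exchange.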
Added 26 May 2010
Updated 26 May 2010
Type Conference
Year 2009
Where ICTIR
Authors Evangelos Kanoulas, Virgiliu Pavlu, Keshi Dai, Javed A. Aslam