Information Retrieval as Statistical Translation

We propose a new probabilistic approach to information retrieval based upon the ideas and methods of statistical machine translation. The central ingredient in this approach is a statistical model of how a user might distill or "translate" a given document into a query. To assess the relevance of a document to a user's query, we estimate the probability that the query would have been generated as a translation of the document, and factor in the user's general preferences in the form of a prior distribution over documents. We propose a simple, well motivated model of the document-to-query translation process, and describe an algorithm for learning the parameters of this model in an unsupervised manner from a collection of documents. As we show, one can view this approach as a generalization and justification of the "language modeling" strategy recently proposed by Ponte and Croft. In a series of experiments on TREC data, a simple translation-based retrieval system pe...
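The ranking rule described in the abstract (score a document by the probability that the query is a translation of it, weighted by a prior over documents) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the word-by-word mixture form, the background smoothing weight `alpha`, the translation table `t`, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter


def doc_language_model(doc_tokens):
    """Maximum-likelihood unigram distribution l(w | d) over a document's words."""
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    return {w: c / total for w, c in counts.items()}


def query_log_likelihood(query_tokens, doc_tokens, t, background, alpha=0.1):
    """log p(q | d) under an assumed word-by-word translation model.

    t maps (query_word, doc_word) -> t(query_word | doc_word).
    background maps query_word -> a corpus-level probability used for
    smoothing (an assumption of this sketch; unseen words get a small floor).
    """
    l_d = doc_language_model(doc_tokens)
    log_p = 0.0
    for q in query_tokens:
        # Probability that some document word "translates" into q.
        translated = sum(t.get((q, w), 0.0) * p_w for w, p_w in l_d.items())
        p_q = alpha * background.get(q, 1e-9) + (1 - alpha) * translated
        log_p += math.log(p_q)
    return log_p


def rank(query_tokens, docs, t, background, prior=None, alpha=0.1):
    """Order document ids by log p(q | d) + log p(d), best first."""
    scores = {}
    for doc_id, tokens in docs.items():
        # A missing prior is treated as uniform, which does not affect ranking.
        log_prior = math.log(prior[doc_id]) if prior else 0.0
        scores[doc_id] = (
            query_log_likelihood(query_tokens, tokens, t, background, alpha)
            + log_prior
        )
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    docs = {
        "d1": ["cat", "feline", "pet"],
        "d2": ["stock", "market", "price"],
    }
    # Hypothetical translation probabilities; a real table would be learned.
    t = {("kitten", "cat"): 0.5, ("kitten", "feline"): 0.3}
    background = {"kitten": 0.01}
    print(rank(["kitten"], docs, t, background))
```

In this sketch, setting `t` to the identity (each word translates only to itself with probability 1) reduces the score to a smoothed query likelihood under the document's unigram model, which is one way to read the abstract's remark that the approach generalizes the language modeling strategy.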
Adam L. Berger, John D. Lafferty
Added 03 Aug 2010
Updated 03 Aug 2010
Type Conference
Year 1999