The difficulty with information retrieval for OCR documents lies in the fact that OCR documents comprise of a significant amount of erroneous words and unfortunately most informat...
This paper presents a new discriminative model for information retrieval (IR), referred to as linear discriminant model (LDM), which provides a flexible framework to incorporate a...
We propose a new probabilistic approach to information retrieval based upon the ideas and methods of statistical machine translation. The central ingredient in this approach is a ...