We propose a new probabilistic approach to information retrieval based upon the ideas and methods of statistical machine translation. The central ingredient in this approach is a ...
We investigate an automatic method for Cross Language Information Retrieval (CLIR) that utilizes the multilingual UMLS Metathesaurus to translate Spanish and French natural languag...
David Eichmann, Miguel E. Ruiz, Padmini Srinivasan
We investigate the connection between part of speech (POS) distribution and content in language. We define POS blocks to be groups of parts of speech. We hypothesise that there ex...
The difficulty with information retrieval for OCR documents is that they contain a significant number of erroneous words, and unfortunately most informat...
Using phrases as content representations for documents and queries has generally been accepted as a desirable feature in information retrieval systems because phr...