Shannon's Noisy-Channel model, which describes how a corrupted message might be reconstructed, has been the corner stone for much work in statistical language and speech proc...
This paper describes a framework for defining domain specific Feature Functions in a user friendly form to be used in a Maximum Entropy Markov Model (MEMM) for the Named Entity Re...
The expectation maximization (EM) algorithm is a widely used maximum likelihood estimation procedure for statistical models when the values of some of the variables in the model a...
Truecasing is the process of restoring case information to badly-cased or noncased text. This paper explores truecasing issues and proposes a statistical, language modeling based ...
Lucian Vlad Lita, Abraham Ittycheriah, Salim Rouko...
The Dublin City University participation in the CLEF 2006 CL-SR task concentrated on exploring the combination of the multiple fields associated with the documents. This was based...
Gareth J. F. Jones, Ke Zhang, Adenike M. Lam-Adesi...