Most of the Indian scripts do not have any robust commercial OCRs. Many of the laboratory prototypes report reasonable results at recognition/classification stage. However, word ...
tion Abstract ChengXiang Zhai (Advisor: John Lafferty) Language Technologies Institute School of Computer Science Carnegie Mellon University With the dramatic increase in online in...
Statistical machine translation systems are usually trained on large amounts of bilingual text (used to learn a translation model), and also large amounts of monolingual text in th...
Entity detection and tracking is a relatively new addition to the repertoire of natural language tasks. In this paper, we present a statistical language-independent framework for ...
Radu Florian, Hany Hassan, Abraham Ittycheriah, Ho...
We demonstrate a new research approach to the problem of predicting the reading difficulty of a text passage, by recasting readability in terms of statistical language modeling. W...