This paper deals with the uses of the annotations of third person singular neuter pronouns in the DAD parallel and comparable corpora of Danish and Italian texts and spoken data. ...
The Arabic language has a very rich morphology where a word is composed of zero or more prefixes, a stem and zero or more suffixes. This makes Arabic data sparse compared to other...
Recent research has made significant advances in automatically constructing knowledge bases by extracting relational facts (e.g., Bill Clinton-presidentOf-US) from large text cor...
Partha Pratim Talukdar, Derry Tanti Wijaya, Tom Mi...
We present results of probabilistic tagging of Czech texts in order to show how these techniques work for one of the highly morphologically ambiguous inflective languages. After d...
Most of the Indian scripts do not have any robust commercial OCRs. Many of the laboratory prototypes report reasonable results at recognition/classification stage. However, word ...