In OCR systems the character segmentation algorithm may generate mis-segmented blocks. Feedback information from character classifier is indispensable to achieve higher character ...
In this paper, we propose forest-to-string rules to enhance the expressive power of tree-to-string translation models. A forestto-string rule is capable of capturing nonsyntactic ...
We present in this paper methods to improve HMM-based part-of-speech (POS) tagging of Mandarin. We model the emission probability of an unknown word using all the characters in th...
Handheld devices generally provide the facility of text input through keys that are an inconvenient and slow way of input. Digitizing tablets and light pens, on the other hand pro...
We investigate generalizations of the allsubtrees "DOP" approach to unsupervised parsing. Unsupervised DOP models assign all possible binary trees to a set of sentences ...