Chinese NE (Named Entity) recognition is a difficult problem because of the uncertainty in word segmentation and flexibility in language structure. This paper proposes the use of ...
Lexicalized reordering models play a crucial role in phrase-based translation systems. They are usually learned from the word-aligned bilingual corpus by examining the reordering ...
The output of a speech recognition system is not always ideal for subsequent downstream processing, in part because speakers themselves often make mistakes. A system would accompl...
Transliteration of new named entity is important for information retrieval that crosses two or multiple language. Rule-based machine transliteration is not satisfactory, since dif...
We present our efforts to create a large-scale, semi-automatically annotated parallel corpus of cleft constructions. The corpus is intended to reduce or make more effective the ma...