Web-search queries are known to be short, but little else is known about their structure. In this paper we investigate the applicability of part-of-speech tagging to typical Engli...
Speech recognition in many morphologically rich languages suffers from a very high out-of-vocabulary (OOV) ratio. Earlier work has shown that vocabulary decomposition methods can ...
In this paper, we present CaptionEye/KE, a Korean-to-English machine translation system applied to practical TV caption translation. Its experimental evaluation is p...
Seong-il Yang, Young Kil Kim, Young Ae Seo, Sung-K...
It has previously been assumed in the psycholinguistic literature that finite-state models of language are crucially limited in their explanatory power by the locality of the prob...
The Online Database of Interlinear Text (ODIN) is a database of interlinear text "snippets", harvested mostly from scholarly documents posted to the Web. Although large...