For resource-limited language pairs, coverage of the test set by the parallel corpus is an important factor that affects translation quality in two respects: 1) out-of-vocabulary ...
Recent text and speech processing applications such as speech mining raise new and more general problems related to the construction of language models. We present and describe in...
Documents in languages such as Chinese, Japanese and Korean sometimes annotate terms with their English translations in parentheses. We present a method to extrac...
Dekang Lin, Shaojun Zhao, Benjamin Van Durme, Mari...
A D-polyhedron is a polyhedron P such that if x, y are in P then so are their componentwise max and min. In other words, the point set of a D-polyhedron forms a distributive latti...
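The closure condition in this definition can be illustrated on a finite point set. The sketch below is only an illustration under stated assumptions: it checks a finite list of points rather than a full polyhedron, and the function names (`componentwise_max`, `componentwise_min`, `is_closed_under_max_min`) are hypothetical, not taken from the paper.

```python
from itertools import combinations


def componentwise_max(x, y):
    """Return the coordinate-wise maximum of two equal-length points."""
    return tuple(max(a, b) for a, b in zip(x, y))


def componentwise_min(x, y):
    """Return the coordinate-wise minimum of two equal-length points."""
    return tuple(min(a, b) for a, b in zip(x, y))


def is_closed_under_max_min(points):
    """Check the D-polyhedron closure condition on a finite point set.

    Assumption: `points` is a finite collection of equal-length numeric
    tuples. A real polyhedron has infinitely many points, so this only
    tests the lattice property on a sample.
    """
    point_set = set(map(tuple, points))
    for x, y in combinations(point_set, 2):
        if componentwise_max(x, y) not in point_set:
            return False
        if componentwise_min(x, y) not in point_set:
            return False
    return True
```

For example, the four corners of the unit square pass the check, while the pair (1, 0), (0, 1) alone fails it because the componentwise max (1, 1) and min (0, 0) are missing.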
I argue that because of spelling and typing errors and other properties of typed text, the identification of words and word boundaries in general requires syntactic and semantic k...