Parallel corpora are critical resources for machine translation research and development because they contain translation equivalences of various granularities. Manual a...
We study several problems related to finding reset words in deterministic finite automata. In particular, we establish that the problem of deciding whether a shortest reset word ha...
An efficient low-level word image representation plays a crucial role in general cursive word recognition. This paper proposes a novel representation scheme, where a word...
Jian-xiong Dong, Adam Krzyzak, Ching Y. Suen, Domi...
Standard approaches to Chinese word segmentation treat the problem as a tagging task, assigning labels to the characters in the sequence indicating whether the character marks a w...
Disfluent speech adds to the difficulty of processing spoken language utterances. In this paper we concentrate on identifying one disfluency phenomenon: fragmented words. Our d...