This paper presents a general platform, namely synchronous tree sequence substitution grammar (STSSG), for the grammar comparison study in Translational Equivalence Modeling (TEM)...
Min Zhang, Hongfei Jiang, Haizhou Li, AiTi Aw, She...
Typical statistical machine translation systems are trained with static parallel corpora. Here we account for scenarios with a continuous incoming stream of parallel training data...
Abby Levenberg, Chris Callison-Burch, Miles Osborn...
Compounded words are a challenge for NLP applications such as machine translation (MT). We introduce methods to learn splitting rules from monolingual and parallel corpora. We eva...
In this paper, we propose a new learning method for extracting bilingual word pairs from parallel corpora in various languages. In cross-language information retrieval, the system...
CLIR resources, such as dictionaries and parallel corpora, are scarce for special domains. Obtaining comparable corpora automatically for such domains could be an answer to this p...