In this paper, we report a way of constructing a translation corpus that contains not only source and target texts but also draft and final versions of target texts, through the transl...
Cross-language latent semantic indexing is a method that learns useful language-independent vector representations of terms through a statistical analysis of a document-aligned text...
Our work analyzes the usefulness of microblogging in second language learning using the example of the social network Twitter. Most learners of English do not require eve...
Data sparseness is one of the factors that degrade the quality of statistical machine translation (SMT). Existing work has shown that using morphosyntactic information is an effective solution t...
Factored Statistical Machine Translation extends the phrase-based SMT model by allowing each word to be represented as a vector of factors. Experiments have shown the effectiveness of many factors, ...