Sciweavers

» Extraction of Lexical Translations from Non-Aligned Corpora
ICCPOL
2009
Springer
13 years 10 months ago
Constructing Parallel Corpus from Movie Subtitles
This paper describes a methodology for constructing aligned German-Chinese corpora from movie subtitles. The corpora will be used to train a special machine translation s...
Han Xiao, Xiaojie Wang
COLING
2010
13 years 9 days ago
Extraction of Multi-word Expressions from Small Parallel Corpora
We present a general methodology for extracting multi-word expressions (of various types), along with their translations, from small parallel corpora. We automatically align the p...
Yulia Tsvetkov, Shuly Wintner
COLING
2010
13 years 9 days ago
EM-based Hybrid Model for Bilingual Terminology Extraction from Comparable Corpora
In this paper, we present an unsupervised hybrid model which combines statistical, lexical, linguistic, contextual, and temporal features in a generic EM-based framework to harvest...
Lianhau Lee, AiTi Aw, Min Zhang, Haizhou Li
ACL
2012
11 years 7 months ago
ACCURAT Toolkit for Multi-Level Alignment and Information Extraction from Comparable Corpora
The lack of parallel corpora and linguistic resources for many languages and domains is one of the major obstacles for the further advancement of automated translation. A possible...
Marcis Pinnis, Radu Ion, Dan Stefanescu, Fangzhong...
CICLING
2009
Springer
14 years 5 months ago
Exploiting Parallel Treebanks to Improve Phrase-Based Statistical Machine Translation
We use existing tools to automatically build two parallel treebanks from existing parallel corpora. We then show that combining the data extracted from both the treebanks and the ...
John Tinsley, Mary Hearne, Andy Way