Sciweavers

NAACL
2010
13 years 2 months ago
Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment
The quality of a statistical machine translation (SMT) system is heavily dependent upon the amount of parallel sentences used in training. In recent years, there have been several...
Jason R. Smith, Chris Quirk, Kristina Toutanova
ACL
2012
11 years 6 months ago
ACCURAT Toolkit for Multi-Level Alignment and Information Extraction from Comparable Corpora
The lack of parallel corpora and linguistic resources for many languages and domains is one of the major obstacles for the further advancement of automated translation. A possible...
Marcis Pinnis, Radu Ion, Dan Stefanescu, Fangzhong...
COLING
2010
12 years 11 months ago
An Empirical Study on Web Mining of Parallel Data
This paper presents an empirical approach to mining parallel corpora. Conventional approaches use a readily available collection of comparable, nonparallel corpora to extract par...
Gum-Won Hong, Chi-Ho Li, Ming Zhou, Hae-Chang Rim
LREC
2008
13 years 6 months ago
Creating Sentence-Aligned Parallel Text Corpora from a Large Archive of Potential Parallel Text using BITS and Champollion
Parallel text is one of the most valuable resources for development of statistical machine translation systems and other NLP applications. The Linguistic Data Consortium (LDC) has...
Kazuaki Maeda, Xiaoyi Ma, Stephanie Strassel
ACL
2006
13 years 5 months ago
Extracting Parallel Sub-Sentential Fragments from Non-Parallel Corpora
We present a novel method for extracting parallel sub-sentential fragments from comparable, non-parallel bilingual corpora. By analyzing potentially similar sentence pairs using a...
Dragos Stefan Munteanu, Daniel Marcu