Sciweavers

» Corpora and data preparation
LREC
2010
14 years 11 months ago
Data Collection and IPR in Multilingual Parallel Corpora. Dutch Parallel Corpus
After three years of work the Dutch Parallel Corpus (DPC) project has reached an end. The finalized corpus is a ten-million-word high-quality sentence-aligned bidirectional parall...
Orphée De Clercq, Maribel Montero Perez
DAARC
2007
Springer
15 years 4 months ago
Using Very Large Parsed Corpora and Judgment Data to Classify Verb Reflexivity
Erik-Jan Smits, Petra Hendriks, Jennifer Spenader
COLING
2010
14 years 4 months ago
An Empirical Study on Web Mining of Parallel Data
This paper presents an empirical approach to mining parallel corpora. Conventional approaches use a readily available collection of comparable, nonparallel corpora to extract par...
Gum-Won Hong, Chi-Ho Li, Ming Zhou, Hae-Chang Rim
CICLING
2009
Springer
15 years 10 months ago
Exploiting Parallel Treebanks to Improve Phrase-Based Statistical Machine Translation
We use existing tools to automatically build two parallel treebanks from existing parallel corpora. We then show that combining the data extracted from both the treebanks and the ...
John Tinsley, Mary Hearne, Andy Way