After three years of work the Dutch Parallel Corpus (DPC) project has reached an end. The finalized corpus is a ten-million-word high-quality sentence-aligned bidirectional parall...
This paper1 presents an empirical approach to mining parallel corpora. Conventional approaches use a readily available collection of comparable, nonparallel corpora to extract par...
We use existing tools to automatically build two parallel treebanks from existing parallel corpora. We then show that combining the data extracted from both the treebanks and the ...