Sciweavers

LREC
2010
Data Collection and IPR in Multilingual Parallel Corpora. Dutch Parallel Corpus
After three years of work the Dutch Parallel Corpus (DPC) project has reached an end. The finalized corpus is a ten-million-word high-quality sentence-aligned bidirectional parall...
Orphée De Clercq, Maribel Montero Perez
LREC
2010
Balancing SoNaR: IPR versus Processing Issues in a 500-Million-Word Written Dutch Reference Corpus
In the Low Countries, a major reference corpus for written Dutch is currently being built. In this paper, we discuss the interplay between data acquisition and data processing dur...
Martin Reynaert, Nelleke Oostdijk, Orphée D...
ECIR
2006
Springer
Automatic Acquisition of Chinese-English Parallel Corpus from the Web
Parallel corpora are a valuable resource for tasks such as cross-language information retrieval and data-driven natural language processing systems. Previously only small scale cor...
Ying Zhang, Ke Wu, Jianfeng Gao, Phil Vines
ACL
2009
Active Learning for Multilingual Statistical Machine Translation
Statistical machine translation (SMT) models require bilingual corpora for training, and these corpora are often multilingual with parallel text in multiple languages simultaneous...
Gholamreza Haffari, Anoop Sarkar
AIRS
2004
Springer
Multilingual Relevant Sentence Detection Using Reference Corpus
IR with a reference corpus is one approach to relevant sentence detection; it takes the result of IR as the representation of the query (sentence). Lack of informatio...
Ming-Hung Hsu, Ming-Feng Tsai, Hsin-Hsi Chen