Sciweavers

IR
2008
13 years 4 months ago
Focused web crawling in the acquisition of comparable corpora
CLIR resources, such as dictionaries and parallel corpora, are scarce for special domains. Obtaining comparable corpora automatically for such domains could be an answer to this p...
Tuomas Talvensaari, Ari Pirkola, Kalervo Järv...
ACL
2006
13 years 5 months ago
Named Entity Transliteration with Comparable Corpora
In this paper we investigate Chinese-English name transliteration using comparable corpora, corpora where texts in the two languages deal in some of the same topics -- and therefor...
Richard Sproat, Tao Tao, ChengXiang Zhai
ACL
2006
13 years 5 months ago
Using Comparable Corpora to Solve Problems Difficult for Human Translators
In this paper we present a tool that uses comparable corpora to find appropriate translation equivalents for expressions that are considered by translators as difficult. For a phr...
Serge Sharoff, Bogdan Babych, Anthony Hartley
ECIR
2008
Springer
13 years 5 months ago
Effects of Aligned Corpus Quality and Size in Corpus-Based CLIR
Aligned corpora are often-used resources in CLIR systems. The three qualities of translation corpora that most dramatically affect the performance of a corpus-based CLIR system are...
Tuomas Talvensaari
LREC
2010
13 years 5 months ago
Using Comparable Corpora to Adapt a Translation Model to Domains
Statistical machine translation (SMT) requires a large parallel corpus, which is available only for restricted language pairs and domains. To expand the language pairs and domains...
Hiroyuki Kaji, Takashi Tsunakawa, Daisuke Okada
ACL
2007
13 years 5 months ago
Assisting Translators in Indirect Lexical Transfer
We present the design and evaluation of a translator’s amanuensis that uses comparable corpora to propose and rank non-literal solutions to the translation of expressions from th...
Bogdan Babych, Anthony Hartley, Serge Sharoff, Olg...
IRAL
2003
ACM
13 years 9 months ago
Learning bilingual translations from comparable corpora to cross-language information retrieval: hybrid statistics-based and lin
Recent years saw an increased interest in the use and the construction of large corpora. With this increased interest and awareness has come an expansion in the application to kno...
Fatiha Sadat, Masatoshi Yoshikawa, Shunsuke Uemura
EACL
2009
ACL Anthology
14 years 4 months ago
On the Use of Comparable Corpora to Improve SMT performance
Sadaf Abdul-Rauf, Holger Schwenk