Learning bilingual translations from comparable corpora to cross-language information retrieval: hybrid statistics-based and lin


Download acl.ldc.upenn.edu

Recent years have seen increased interest in the use and construction of large corpora. With this interest and awareness has come an expansion in their application to knowledge acquisition and bilingual terminology extraction. This paper presents an approach to bilingual lexicon extraction from non-aligned comparable corpora, combined with linguistics-based pruning and evaluated on Cross-Language Information Retrieval. We propose and explore a two-stage translation model for the acquisition of bilingual terminology from comparable corpora, with disambiguation and selection of the best translation alternatives on the basis of their morphological knowledge. Evaluations using a large-scale Japanese-English test collection and different weighting schemes of the SMART retrieval system confirmed the effectiveness of the proposed combination of the two-stage comparable corpora approach and linguistics-based pruning on Cross-Language Information Retrieval.

Fatiha Sadat, Masatoshi Yoshikawa, Shunsuke Uemura


Bilingual Terminology | Comparable Corpora | Cross-Language Information Retrieval | Information Retrieval | IRAL 2003


Added	05 Jul 2010
Updated	05 Jul 2010
Type	Conference
Year	2003
Where	IRAL
Authors	Fatiha Sadat, Masatoshi Yoshikawa, Shunsuke Uemura
