Sciweavers

AI
1998
Springer

Translingual Information Retrieval: Learning from Bilingual Corpora

Translingual information retrieval (TLIR) consists of providing a query in one language and searching document collections in one or more different languages. This paper introduces new TLIR methods and reports on comparative TLIR experiments with these new methods and with previously reported ones in a realistic setting. Methods fall into two categories: query translation and statistical-IR approaches establishing translingual associations. The results show that using bilingual corpora for automated extraction of term equivalences in context outperforms dictionary-based methods. Translingual versions of the Generalized Vector Space Model (GVSM) and Latent Semantic Indexing (LSI) also perform well, as do translingual pseudo relevance feedback (PRF) and Example-Based Term-in-context Translation (EBT). All showed relatively small performance loss between monolingual and translingual versions, ranging from 87% to 101% of monolingual IR performance. Query translation based on a general ...
Added 21 Dec 2010
Updated 21 Dec 2010
Type Journal
Year 1998
Where AI
Authors Yiming Yang, Jaime G. Carbonell, Ralf D. Brown, Robert E. Frederking