TREC-9 CLIR at CUHK: Disambiguation by Similarity Values Between Adjacent Words

We investigated a dictionary-based query translation method combined with a translation disambiguation process that uses statistical co-occurrence information trained from the provided corpus. We believe that neighboring words tend to be related in contextual meaning and have a higher chance of co-occurring, particularly if two or more adjacent words compose a phrase. The correct translation equivalents of a co-occurrence pattern in the source language are more likely to co-occur with each other in target-language documents, within a contextual window of a certain size, than with any incorrect translation equivalents. In this work, we tested several methods of calculating the degree of co-occurrence and used them as the basis for disambiguation. Unlike most disambiguation methods, which usually select the single best translation equivalent for each word, we select the best pair of translation equivalents for two adjacent words. The final translated queries are the concatenation of all overlapped adjac...
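The pair-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the co-occurrence measures that were tested, so the Dice coefficient used here is an assumption, as are the function names `cooccurrence_counts`, `dice`, and `translate_query`, the default window size of 3, and the choice to concatenate the selected pairs as they come. Queries with fewer than two words are not handled.

```python
from collections import Counter
from itertools import product


def cooccurrence_counts(docs, window=3):
    """Count word frequencies and windowed pair frequencies.

    `docs` is an iterable of token lists in the target language.
    The window size is an illustrative assumption.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in docs:
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            # Pair each token with the tokens that follow it inside the window.
            for right in tokens[i + 1 : i + 1 + window]:
                pair_counts[frozenset((left, right))] += 1
    return word_counts, pair_counts


def dice(a, b, word_counts, pair_counts):
    """Dice coefficient as one possible co-occurrence score (an assumption)."""
    denom = word_counts[a] + word_counts[b]
    if denom == 0:
        return 0.0
    return 2.0 * pair_counts[frozenset((a, b))] / denom


def translate_query(source_words, dictionary, word_counts, pair_counts):
    """Pick the best-scoring translation pair for each adjacent word pair.

    Words missing from the dictionary are passed through unchanged.
    The selected pairs are concatenated, so a word shared by two
    adjacent pairs contributes a translation to each of them.
    """
    translated = []
    for left, right in zip(source_words, source_words[1:]):
        candidates = product(
            dictionary.get(left, [left]),
            dictionary.get(right, [right]),
        )
        best = max(
            candidates,
            key=lambda pair: dice(pair[0], pair[1], word_counts, pair_counts),
        )
        translated.extend(best)
    return translated
```

In this sketch, each source word is disambiguated jointly with its neighbor rather than on its own, which mirrors the contrast the abstract draws with methods that pick one best translation per word. Swapping `dice` for another association score only requires changing the `key` function.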
Honglan Jin, Kam-Fai Wong
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 2000
Where TREC
Authors Honglan Jin, Kam-Fai Wong