Bilingual dictionary generation for low-resourced language pairs

Bilingual dictionaries are vital resources in many areas of natural language processing. Many machine translation methods require bilingual dictionaries with large coverage, but less common language pairs rarely have any digitized resources. Since the need for these resources is increasing while human resources for less-represented languages remain scarce, efficient automated methods are needed. This paper introduces a fully automated, robust, pivot-language-based bilingual dictionary generation method that uses the WordNet of the pivot language to build a new bilingual dictionary. We propose using WordNet to increase accuracy; we also introduce a bidirectional selection method with a flexible threshold to maximize recall. Our evaluations showed 79% accuracy and 51% weighted recall, outperforming representative pivot-language-based methods. A dictionary generated with this method will still need manual post-editing, but the improved recall and precision dec...
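The general pivot idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' method: it does not use WordNet, and the scoring (a Dice overlap of pivot translations, compared from both the source and the target side) and the fixed threshold are assumptions chosen for illustration. The function names and the toy Spanish/German entries routed through English are hypothetical.

```python
from collections import defaultdict

def invert(dictionary):
    """Turn word -> {translations} into translation -> {words}."""
    inverted = defaultdict(set)
    for word, translations in dictionary.items():
        for translation in translations:
            inverted[translation].add(word)
    return dict(inverted)

def pivot_dictionary(src_to_pivot, tgt_to_pivot, threshold):
    """Pair source and target words that share pivot translations.

    The score is the Dice coefficient of the two pivot translation
    sets (an illustrative assumption, not the paper's scoring).
    """
    pivot_to_tgt = invert(tgt_to_pivot)
    result = {}
    for src, src_pivots in src_to_pivot.items():
        src_pivots = set(src_pivots)
        # Every target word reachable through at least one pivot word.
        candidates = set()
        for pivot in src_pivots:
            candidates |= pivot_to_tgt.get(pivot, set())
        scored = {}
        for tgt in candidates:
            tgt_pivots = set(tgt_to_pivot[tgt])
            shared = src_pivots & tgt_pivots
            score = 2 * len(shared) / (len(src_pivots) + len(tgt_pivots))
            if score >= threshold:
                scored[tgt] = score
        if scored:
            result[src] = scored
    return result

# Hypothetical toy data: source and target words with English pivots.
SRC_TO_PIVOT = {
    "orilla": {"bank", "shore"},
    "banco": {"bank", "bench"},
}
TGT_TO_PIVOT = {
    "Ufer": {"bank", "shore"},
    "Bank": {"bank", "bench"},
    "Geldinstitut": {"bank"},
}

strict = pivot_dictionary(SRC_TO_PIVOT, TGT_TO_PIVOT, threshold=0.7)
loose = pivot_dictionary(SRC_TO_PIVOT, TGT_TO_PIVOT, threshold=0.6)
```

In this toy example the ambiguous pivot word "bank" links every source word to every target word. A stricter threshold keeps only the pairs whose pivot sets agree fully, while a looser one admits more candidates, some correct and some not, which is the precision and recall trade-off the abstract alludes to.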
István Varga, Shoichi Yokoyama
Added 17 Feb 2011
Updated 17 Feb 2011
Type Journal
Year 2009