Building a Cross-lingual Relatedness Thesaurus using a Graph Similarity Measure

The Internet is an ever-growing source of information stored in documents in different languages. Hence, cross-lingual resources are needed for more and more NLP applications. This paper presents (i) a graph-based method for creating one such resource and (ii) a resource created using the method, a cross-lingual relatedness thesaurus. Given a word in one language, the thesaurus suggests words in a second language that are semantically related. The method requires two monolingual corpora and a basic dictionary. Our general approach is to build two monolingual word graphs, with nodes representing words and edges representing linguistic relations between words. A bilingual dictionary containing basic vocabulary provides seed translations relating nodes from both graphs. We then use an inter-graph node-similarity algorithm to discover related words. Evaluation with three human judges revealed that 49% of the English and 57% of the German words discovered by our method are semantically rel...
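
The general approach described in the abstract (two monolingual word graphs linked by seed translations, then an inter-graph node-similarity computation) can be illustrated with a minimal sketch. The abstract does not specify the similarity algorithm, so the code below swaps in a simplified SimRank-style propagation as an assumption; it is not the authors' method. The function names `cross_graph_similarity` and `related`, the `decay` and `iterations` parameters, and the toy graphs are all hypothetical.

```python
from itertools import product


def cross_graph_similarity(graph_a, graph_b, seeds, decay=0.8, iterations=5):
    """Simplified SimRank-style similarity between nodes of two graphs.

    graph_a, graph_b: dict mapping each word to the set of its neighbours
                      (every neighbour must also be a key of the same graph).
    seeds:            set of (word_a, word_b) dictionary translations.
    Assumption: this is an illustrative stand-in, not the paper's algorithm.
    """
    pairs = list(product(graph_a, graph_b))
    # Seed translations start at (and stay at) similarity 1.0.
    sim = {p: 1.0 if p in seeds else 0.0 for p in pairs}
    for _ in range(iterations):
        new_sim = {}
        for a, b in pairs:
            if (a, b) in seeds:
                new_sim[(a, b)] = 1.0
                continue
            na, nb = graph_a[a], graph_b[b]
            if not na or not nb:
                new_sim[(a, b)] = 0.0
                continue
            # Two words are similar if their neighbours are similar.
            total = sum(sim[(x, y)] for x in na for y in nb)
            new_sim[(a, b)] = decay * total / (len(na) * len(nb))
        sim = new_sim
    return sim


def related(sim, word, k=3):
    """Return up to k words of the second language with non-zero similarity."""
    scored = [(score, b) for (a, b), score in sim.items() if a == word and score > 0]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [b for _, b in scored[:k]]


# Toy English and German word graphs (hypothetical data).
english = {
    "car": {"drive", "road"}, "drive": {"car"}, "road": {"car"},
    "bread": {"eat"}, "eat": {"bread"},
}
german = {
    "Auto": {"fahren", "Straße"}, "fahren": {"Auto"}, "Straße": {"Auto"},
    "Brot": {"essen"}, "essen": {"Brot"},
}
# Seed translations from a basic bilingual dictionary (hypothetical).
seeds = {("drive", "fahren"), ("road", "Straße"), ("eat", "essen")}

sim = cross_graph_similarity(english, german, seeds)
print(related(sim, "car"))
print(related(sim, "bread"))
```

In this toy setup, "car" and "Auto" are not in the seed dictionary, but they acquire a non-zero similarity because their neighbours are seed translations of each other. Fixing seed pairs at 1.0 is a design choice of this sketch, made so that the dictionary keeps anchoring the propagation in every iteration.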

Cross-lingual Relatedness Thesaurus | Education | LREC 2010 | Nodes Representing Words | Thesaurus Suggests Words |

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2010
Where	LREC
Authors	Lukas Michelbacher, Florian Laws, Beate Dorow, Ulrich Heid, Hinrich Schütze