LREC
2010

Extraction of German Multiword Expressions from Parsed Corpora Using Context Features

We report on tools for extracting German multiword expressions (MWEs) from text corpora; we extract word pairs as well as longer MWEs of different patterns, e.g. verb-noun structures with an additional prepositional phrase or adjective. In addition to standard association-based extraction, we focus on morpho-syntactic, syntactic and lexical-choice features of the MWE candidates. A broad range of such properties (e.g. number and definiteness of nouns, adjacency of the MWE's components and their position in the sentence, preferred lexical modifiers, etc.), along with relevant example sentences, is extracted from dependency-parsed text and stored in a database. A sample precision evaluation and an analysis of extraction errors are provided along with a discussion of our extraction architecture. We furthermore measure the contribution of the features to the precision of the extraction: by using both morpho-syntactic and syntactic features, we achieve a higher precision in the id...
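
As a rough illustration of the two kinds of evidence the abstract describes, the following minimal sketch scores verb-noun pairs with an association measure and builds a morpho-syntactic profile for one candidate. It is not the authors' implementation: the abstract does not name the association measure, so pointwise mutual information (PMI) is used here as an assumed stand-in, and the toy observations, the feature labels (`sg`/`pl`, `def`/`indef`/`none`) and the function names `pmi_scores` and `feature_profile` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy input: (verb lemma, noun lemma, noun number, determiner type)
# tuples, as they might be read off verb-object relations in parsed text.
observations = [
    ("stellen", "Frage", "sg", "indef"),
    ("stellen", "Frage", "sg", "indef"),
    ("stellen", "Frage", "pl", "none"),
    ("stellen", "Antrag", "sg", "indef"),
    ("lesen", "Buch", "sg", "def"),
    ("lesen", "Frage", "sg", "def"),
]


def pmi_scores(obs):
    """Pointwise mutual information (log base 2) for each verb-noun pair.

    PMI is an assumption made for this sketch; the source only says
    "standard association-based extraction".
    """
    pair_counts = Counter((v, n) for v, n, _, _ in obs)
    verb_counts = Counter(v for v, _, _, _ in obs)
    noun_counts = Counter(n for _, n, _, _ in obs)
    total = len(obs)
    return {
        (v, n): math.log2((c * total) / (verb_counts[v] * noun_counts[n]))
        for (v, n), c in pair_counts.items()
    }


def feature_profile(obs, pair):
    """Relative frequency of each (number, determiner) combination for one pair."""
    feats = Counter((num, det) for v, n, num, det in obs if (v, n) == pair)
    total = sum(feats.values())
    return {f: c / total for f, c in feats.items()}


scores = pmi_scores(observations)
profile = feature_profile(observations, ("stellen", "Frage"))

# Rank candidates by association score, highest first.
for pair, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(pair, round(score, 3))
print(profile)
```

A strongly skewed profile (for instance, a noun that occurs almost only in the singular with one determiner type) is the sort of morpho-syntactic signal that could be combined with the association score when filtering candidates; how the features are actually weighted in the paper is not stated in this abstract.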
Marion Weller, Ulrich Heid
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2010
Where LREC