Sciweavers

FSMNLP
2005
Springer

TAGH: A Complete Morphology for German Based on Weighted Finite State Automata

TAGH is a system for the automatic recognition of German word forms. It is based on a stem lexicon with allomorphs and a concatenative mechanism for inflection and word formation. Weighted finite-state automata (FSA) and a cost function are used to determine the correct segmentation of complex forms: the correct segmentation of a given compound is assumed to be the one with the least cost. The stem lexicon contains almost 80,000 stems and was compiled over 5 years from large newspaper corpora and literary texts. More than 1,000 different rules for derivational and compositional word formation considerably increase the number of analyzable word forms. The recognition rate of TAGH is more than 99% for modern newspaper text and approximately 98.5% for literary texts.
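The least-cost principle described in the abstract can be illustrated with a small sketch. It is not TAGH's implementation: it uses plain dynamic programming over a toy lexicon instead of weighted automata, and the names `TOY_LEXICON`, `BOUNDARY_COST`, and `best_segmentation`, as well as all cost values, are hypothetical and chosen only for illustration. The idea is that every stem carries a cost, every compound boundary adds a penalty, and the analysis with the lowest total cost is selected.

```python
from functools import lru_cache

# Hypothetical toy lexicon mapping stems to costs (not taken from TAGH).
TOY_LEXICON = {
    "stau": 1.0,    # "traffic jam / damming"
    "becken": 1.0,  # "basin"
    "staub": 1.0,   # "dust"
    "ecken": 3.0,   # "corners"; assumed to be a costlier compound part
}

# Hypothetical penalty added for every boundary between two stems.
BOUNDARY_COST = 5.0


def best_segmentation(word, lexicon=TOY_LEXICON, boundary_cost=BOUNDARY_COST):
    """Return (cost, segments) for the cheapest segmentation, or None."""
    word = word.lower()

    @lru_cache(maxsize=None)
    def solve(start):
        # Cheapest way to segment word[start:]; None if no analysis exists.
        if start == len(word):
            return (0.0, ())
        best = None
        for end in range(start + 1, len(word) + 1):
            piece = word[start:end]
            if piece not in lexicon:
                continue
            rest = solve(end)
            if rest is None:
                continue
            # A boundary penalty applies only if more stems follow.
            penalty = boundary_cost if rest[1] else 0.0
            candidate = (lexicon[piece] + penalty + rest[0], (piece,) + rest[1])
            if best is None or candidate[0] < best[0]:
                best = candidate
        return best

    return solve(0)


if __name__ == "__main__":
    # "Staubecken" is ambiguous: Stau+Becken versus Staub+Ecken.
    print(best_segmentation("Staubecken"))
```

With these assumed costs, the ambiguous compound "Staubecken" is analyzed as Stau+Becken, because that path is cheaper than Staub+Ecken. In a weighted automaton the same selection corresponds to finding the lowest-weight accepting path.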
Alexander Geyken, Thomas Hanneforth
Added 27 Jun 2010
Updated 27 Jun 2010
Type Conference
Year 2005
Where FSMNLP
Authors Alexander Geyken, Thomas Hanneforth