Sciweavers

EACL
2009
ACL Anthology

Enhancing Unlexicalized Parsing Performance Using a Wide Coverage Lexicon, Fuzzy Tag-Set Mapping, and EM-HMM-Based Lexical Proba

We present a framework for interfacing a PCFG parser with lexical information from an external resource following a different tagging scheme than the treebank. This is achieved by defining a stochastic mapping layer between the two resources. Lexical probabilities for rare events are estimated in a semi-supervised manner from a lexicon and large unannotated corpora. We show that this solution greatly enhances the performance of an unlexicalized Hebrew PCFG parser, resulting in state-of-the-art Hebrew parsing results both when a segmentation oracle is assumed, and in a real-world parsing scenario of parsing unsegmented tokens.
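As a rough illustration of what a stochastic mapping layer between two tag sets could look like, the sketch below derives word emission probabilities for treebank tags by marginalizing over lexicon tags: P(word | treebank tag) = sum over lexicon tags of P(lexicon tag | treebank tag) * P(word | lexicon tag). This is an assumed formulation for illustration only, not necessarily the one used in the paper. The function name `map_emissions`, the tag names, the words, and all probabilities are hypothetical toy values.

```python
from collections import defaultdict


def map_emissions(tag_mapping, lexicon_emissions):
    """Combine a fuzzy tag mapping with lexicon-side emission probabilities.

    tag_mapping:       {treebank_tag: {lexicon_tag: P(lexicon_tag | treebank_tag)}}
    lexicon_emissions: {lexicon_tag: {word: P(word | lexicon_tag)}}
    Returns            {treebank_tag: {word: P(word | treebank_tag)}}
    """
    result = {}
    for tb_tag, lex_dist in tag_mapping.items():
        word_probs = defaultdict(float)
        for lex_tag, p_map in lex_dist.items():
            # Lexicon tags with no known emissions contribute nothing.
            for word, p_emit in lexicon_emissions.get(lex_tag, {}).items():
                word_probs[word] += p_map * p_emit
        result[tb_tag] = dict(word_probs)
    return result


# Toy inputs (made up): one treebank tag that maps fuzzily onto two lexicon tags.
tag_mapping = {"NN": {"noun": 0.75, "participle": 0.25}}
lexicon_emissions = {
    "noun": {"word_a": 0.5, "word_b": 0.5},
    "participle": {"word_c": 1.0},
}

emissions = map_emissions(tag_mapping, lexicon_emissions)
for word, prob in sorted(emissions["NN"].items()):
    print(word, prob)
```

Because each input distribution sums to one, the resulting distribution for each treebank tag also sums to one, so it could be plugged in wherever a parser expects per-tag lexical probabilities.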
Added 24 Nov 2009
Updated 24 Nov 2009
Type Conference
Year 2009
Where EACL
Authors Yoav Goldberg, Reut Tsarfaty, Meni Adler, Michael Elhadad