
ACL
2001

XML-Based Data Preparation for Robust Deep Parsing

We describe the use of XML tokenisation, tagging and mark-up tools to prepare a corpus for parsing. Our techniques are generally applicable, but here we focus on parsing Medline abstracts with the ANLT wide-coverage grammar. Hand-crafted grammars inevitably lack coverage, but many coverage failures are due to inadequacies of their lexicons. We describe a method of gaining a degree of robustness by interfacing POS tag information with the existing lexicon. We also show that XML tools provide a sophisticated approach to pre-processing, helping to ameliorate the 'messiness' in real language data and improve parse performance.
Claire Grover, Alex Lascarides
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2001
Where ACL