Sciweavers

ACL
2006

QuestionBank: Creating a Corpus of Parse-Annotated Questions

13 years 6 months ago
QuestionBank: Creating a Corpus of Parse-Annotated Questions
This paper describes the development of QuestionBank, a corpus of 4000 parseannotated questions for (i) use in training parsers employed in QA, and (ii) evaluation of question parsing. We present a series of experiments to investigate the effectiveness of QuestionBank as both an exclusive and supplementary training resource for a state-of-the-art parser in parsing both question and non-question test sets. We introduce a new method for recovering empty nodes and their antecedents (capturing long distance dependencies) from parser output in CFG trees using LFG f-structure reentrancies. Our main findings are (i) using QuestionBank training data improves parser performance to 89.75% labelled bracketing f-score, an increase of almost 11% over the baseline; (ii) back-testing experiments on nonquestion data (Penn-II WSJ Section 23) shows that the retrained parser does not suffer a performance drop on non-question material; (iii) ablation experiments show that the size of training material pr...
John Judge, Aoife Cahill, Josef van Genabith
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2006
Where ACL
Authors John Judge, Aoife Cahill, Josef van Genabith
Comments (0)