Sciweavers

EMNLP
2010

Uptraining for Accurate Deterministic Question Parsing

It is well known that parsing accuracies drop significantly on out-of-domain data. What is less known is that some parsers suffer more from domain shifts than others. We show that dependency parsers have more difficulty parsing questions than constituency parsers. In particular, deterministic shift-reduce dependency parsers, which are of highest interest for practical applications because of their linear running time, drop to 60% labeled accuracy on a question test set. We propose an uptraining procedure in which a deterministic parser is trained on the output of a more accurate, but slower, latent variable constituency parser (converted to dependencies). Uptraining with 100K unlabeled questions achieves results comparable to having 2K labeled questions for training. With 100K unlabeled and 2K labeled questions, uptraining is able to improve parsing accuracy to 84%, closing the gap between in-domain and out-of-domain performance.
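The uptraining procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `uptrain`, `toy_slow_parser`, and `toy_train_fast` are hypothetical, the parse representation (one head index per token, 0 for the root) is an assumption, and the slow parser is assumed to already return dependencies, so the constituency-to-dependency conversion is not shown.

```python
from typing import Callable, List, Sequence, Tuple

Sentence = Tuple[str, ...]
Parse = Tuple[int, ...]            # assumed format: head index per token, 0 = root
Example = Tuple[Sentence, Parse]
Parser = Callable[[Sentence], Parse]


def uptrain(
    slow_parser: Parser,
    train_fast: Callable[[Sequence[Example]], Parser],
    unlabeled: Sequence[Sentence],
    labeled: Sequence[Example] = (),
) -> Parser:
    """Train a fast parser on the output of a slower, more accurate one."""
    # Step 1: label the unlabeled questions automatically with the slow parser.
    auto_labeled: List[Example] = [(s, slow_parser(s)) for s in unlabeled]
    # Step 2: train the fast parser on any labeled data plus the automatic parses.
    return train_fast(list(labeled) + auto_labeled)


# Toy stand-ins (hypothetical) so the sketch runs end to end.
def toy_slow_parser(sentence: Sentence) -> Parse:
    # First token is the root; every other token attaches to token 1.
    return tuple(0 if i == 0 else 1 for i in range(len(sentence)))


def toy_train_fast(examples: Sequence[Example]) -> Parser:
    # "Training" here just memorizes the examples it was given.
    table = {s: p for s, p in examples}

    def fast_parser(sentence: Sentence) -> Parse:
        # Unseen sentences fall back to a chain: each token attaches to the previous one.
        return table.get(sentence, tuple(range(len(sentence))))

    return fast_parser


questions = [("what", "is", "parsing"), ("who", "wrote", "it")]
fast = uptrain(toy_slow_parser, toy_train_fast, questions)
print(fast(("what", "is", "parsing")))
```

In a real setting, `slow_parser` would be the latent variable constituency parser with its output converted to dependencies, and `train_fast` would train the deterministic shift-reduce parser; the toy versions only show how the data flows between the two.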
Slav Petrov, Pi-Chuan Chang, Michael Ringgaard, Hiyan Alshawi
Added 11 Feb 2011
Updated 11 Feb 2011
Type Conference
Year 2010
Where EMNLP
Authors Slav Petrov, Pi-Chuan Chang, Michael Ringgaard, Hiyan Alshawi