Uptraining for Accurate Deterministic Question Parsing

It is well known that parsing accuracies drop significantly on out-of-domain data. What is less known is that some parsers suffer more from domain shifts than others. We show that dependency parsers have more difficulty parsing questions than constituency parsers. In particular, deterministic shift-reduce dependency parsers, which are of highest interest for practical applications because of their linear running time, drop to 60% labeled accuracy on a question test set. We propose an uptraining procedure in which a deterministic parser is trained on the output of a more accurate, but slower, latent variable constituency parser (converted to dependencies). Uptraining with 100K unlabeled questions achieves results comparable to having 2K labeled questions for training. With 100K unlabeled and 2K labeled questions, uptraining is able to improve parsing accuracy to 84%, closing the gap between in-domain and out-of-domain performance.
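As a rough illustration of the uptraining idea described in the abstract, the following minimal sketch trains a fast model on the automatic output of a slower, more accurate one over unlabeled in-domain questions. Everything in it is a hypothetical stand-in and not the authors' implementation: `teacher_parse` is a toy rule playing the role of the latent variable constituency parser (converted to dependencies), `FastStudent` is a per-word lookup table playing the role of the deterministic shift-reduce parser, and `VERBS`, `uptrain`, and the example sentences are invented for the illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy vocabulary; real parsers use learned models, not word lists.
VERBS = {"is", "are", "was", "did", "does"}


def teacher_parse(tokens):
    """Stand-in for the slow, accurate parser (assumption: a toy rule).

    Returns one head index per token, with -1 marking the root.
    """
    root = next((i for i, t in enumerate(tokens) if t in VERBS), 0)
    return [-1 if i == root else root for i in range(len(tokens))]


class FastStudent:
    """Stand-in for the fast deterministic parser: a per-word lookup table."""

    def __init__(self):
        self.offsets = defaultdict(Counter)

    def train(self, examples):
        for tokens, heads in examples:
            for i, (tok, head) in enumerate(zip(tokens, heads)):
                # Store the head as an offset from the token; 0 means root.
                self.offsets[tok][0 if head == -1 else head - i] += 1

    def predict(self, tokens):
        heads = []
        for i, tok in enumerate(tokens):
            counts = self.offsets.get(tok)
            offset = counts.most_common(1)[0][0] if counts else 0
            head = i + offset
            # Treat offset 0 or an out-of-range head as root.
            in_range = 0 <= head < len(tokens)
            heads.append(head if offset != 0 and in_range else -1)
        return heads


def uptrain(student, labeled, unlabeled, teacher=teacher_parse):
    """Train the student on gold data plus teacher-parsed unlabeled data."""
    auto = [(tokens, teacher(tokens)) for tokens in unlabeled]
    student.train(labeled + auto)
    return student


# Usage: no gold questions, two unlabeled questions parsed by the teacher.
unlabeled_questions = [["what", "is", "this"], ["who", "is", "he"]]
student = uptrain(FastStudent(), labeled=[], unlabeled=unlabeled_questions)
prediction = student.predict(["who", "is", "this"])
```

The design point the sketch tries to capture is that the student never sees gold question parses unless some are supplied in `labeled`; its training signal for the new domain comes from the teacher's automatic parses, while prediction still uses only the fast model.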

Slav Petrov, Pi-Chuan Chang, Michael Ringgaard, Hiyan Alshawi

Constituency Parser | EMNLP 2010 | Natural Language Processing | Parser | Questions

Added	11 Feb 2011
Updated	11 Feb 2011
Type	Conference
Year	2010
Where	EMNLP
Authors	Slav Petrov, Pi-Chuan Chang, Michael Ringgaard, Hiyan Alshawi
