From Baby Steps to Leapfrog: How "Less is More" in Unsupervised Dependency Parsing


We present three approaches for unsupervised grammar induction that are sensitive to data complexity and apply them to Klein and Manning's Dependency Model with Valence. The first, Baby Steps, bootstraps itself via iterated learning of increasingly longer sentences and requires no initialization. This method substantially exceeds Klein and Manning's published scores and achieves 39.4% accuracy on Section 23 (all sentences) of the Wall Street Journal corpus. The second, Less is More, uses a low-complexity subset of the available data: sentences up to length 15. Focusing on fewer but simpler examples trades off quantity against ambiguity; it attains 44.1% accuracy, using the standard linguistically-informed prior and batch training, beating state-of-the-art. Leapfrog, our third heuristic, combines Less is More with Baby Steps by mixing their models of shorter sentences, then rapidly ramping up exposure to the full training set, driving up accuracy to 45.0%. These trends general...
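The data-selection side of Baby Steps and Less is More can be illustrated with a minimal sketch. It is not the authors' implementation: the `train` callback, the helper names, and the toy trainer are hypothetical stand-ins for an actual Dependency Model with Valence trainer, and the sketch assumes that each Baby Steps stage uses all sentences up to the current length cutoff and is seeded with the previous stage's model. Leapfrog is not sketched, since the abstract does not spell out how the models are mixed.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

Sentence = Sequence[str]
Model = TypeVar("Model")
# Hypothetical trainer signature: (training sentences, previous model or None) -> new model.
Trainer = Callable[[List[Sentence], Optional[Model]], Model]


def up_to_length(corpus: List[Sentence], max_len: int) -> List[Sentence]:
    """Keep only the sentences with at most max_len tokens."""
    return [s for s in corpus if len(s) <= max_len]


def baby_steps(corpus: List[Sentence], train: Trainer,
               max_len: Optional[int] = None) -> Optional[Model]:
    """Train on progressively longer sentences, seeding each stage with the last model.

    Assumption: stage k sees every sentence of length <= k.
    """
    if max_len is None:
        max_len = max((len(s) for s in corpus), default=0)
    model: Optional[Model] = None  # no initializer for the first stage
    for cutoff in range(1, max_len + 1):
        subset = up_to_length(corpus, cutoff)
        if subset:  # skip cutoffs below the shortest sentence
            model = train(subset, model)
    return model


def less_is_more(corpus: List[Sentence], train: Trainer, cutoff: int = 15) -> Model:
    """Train once on the low-complexity subset (sentences up to `cutoff` tokens).

    Passing None means "use the trainer's own default initialization" (an assumption).
    """
    return train(up_to_length(corpus, cutoff), None)


def toy_train(subset: List[Sentence], prev: Optional[List[int]]) -> List[int]:
    """Toy stand-in for a real trainer: records how many sentences each stage saw."""
    history = list(prev) if prev else []
    history.append(len(subset))
    return history


corpus = [["a"], ["a", "b"], ["a", "b", "c"], ["x", "y"]]
print(baby_steps(corpus, toy_train))              # [1, 3, 4]
print(less_is_more(corpus, toy_train, cutoff=2))  # [3]
```

The toy trainer only counts sentences, which makes the growth of the training set across stages visible; a real trainer would run EM (or a similar procedure) on each subset.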

Valentin I. Spitkovsky, Hiyan Alshawi, Daniel Jurafsky


Baby Steps | Computational Linguistics | Data Complexity | NAACL 2010 | Unsupervised Grammar Induction |


Post Info

Added	14 Feb 2011
Updated	14 Feb 2011
Type	Journal
Year	2010
Where	NAACL
Authors	Valentin I. Spitkovsky, Hiyan Alshawi, Daniel Jurafsky


Sciweavers
