
NAACL
2010

From Baby Steps to Leapfrog: How "Less is More" in Unsupervised Dependency Parsing

We present three approaches for unsupervised grammar induction that are sensitive to data complexity and apply them to Klein and Manning's Dependency Model with Valence. The first, Baby Steps, bootstraps itself via iterated learning of increasingly longer sentences and requires no initialization. This method substantially exceeds Klein and Manning's published scores and achieves 39.4% accuracy on Section 23 (all sentences) of the Wall Street Journal corpus. The second, Less is More, uses a low-complexity subset of the available data: sentences up to length 15. Focusing on fewer but simpler examples trades off quantity against ambiguity; it attains 44.1% accuracy, using the standard linguistically-informed prior and batch training, beating the state of the art. Leapfrog, our third heuristic, combines Less is More with Baby Steps by mixing their models of shorter sentences, then rapidly ramping up exposure to the full training set, driving up accuracy to 45.0%. These trends general...
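The data schedules behind the first two approaches can be illustrated with a minimal sketch. It covers only how training sentences are selected by length, not the Dependency Model with Valence itself. The function names, the `train` callback, and its `(subset, previous_model)` signature are hypothetical and are not taken from the paper. Leapfrog is left out because the abstract does not say how the two models are mixed.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

Sentence = Sequence[str]      # one sentence as a sequence of tokens
Model = TypeVar("Model")      # whatever the (hypothetical) trainer returns


def less_is_more(corpus: List[Sentence], max_len: int = 15) -> List[Sentence]:
    """Keep only the low-complexity subset: sentences up to max_len tokens."""
    return [s for s in corpus if len(s) <= max_len]


def baby_steps(
    corpus: List[Sentence],
    train: Callable[[List[Sentence], Optional[Model]], Model],
    max_len: Optional[int] = None,
) -> Optional[Model]:
    """Train on increasingly longer sentences, reusing the previous model.

    `train(subset, previous_model)` is an assumed interface: it receives all
    sentences up to the current length cutoff plus the model from the previous
    step (None at the first step) and returns an updated model.
    """
    longest = max((len(s) for s in corpus), default=0)
    limit = longest if max_len is None else min(max_len, longest)

    model: Optional[Model] = None
    for cutoff in range(1, limit + 1):
        subset = [s for s in corpus if len(s) <= cutoff]
        if not subset:
            continue  # no sentences this short yet
        # Each step starts from the model learned on shorter sentences.
        model = train(subset, model)
    return model
```

With a real trainer, `train` would run the learning procedure on `subset`, starting from `previous_model` whenever one is available.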
Added 14 Feb 2011
Updated 14 Feb 2011
Type Journal
Year 2010
Where NAACL
Authors Valentin I. Spitkovsky, Hiyan Alshawi, Daniel Jurafsky