Inducing Tree-Substitution Grammars

12 years 18 days ago
Inducing Tree-Substitution Grammars
Inducing a grammar from text has proven to be a notoriously challenging learning task despite decades of research. The primary reason for its difficulty is that in order to induce plausible grammars, the underlying model must be capable of representing the intricacies of language while also ensuring that it can be readily learned from data. The majority of existing work on grammar induction has favoured model simplicity (and thus learnability) over representational capacity by using context free grammars and first order dependency grammars, which are not sufficiently expressive to model many common linguistic constructions. We propose a novel compromise by inferring a probabilistic tree substitution grammar, a formalism which allows for arbitrarily large tree fragments and thereby better represent complex linguistic structures. To limit the model's complexity we employ a Bayesian non-parametric prior which biases the model towards a sparse grammar with shallow productions. We dem...
Trevor Cohn, Phil Blunsom, Sharon Goldwater
Added 19 May 2011
Updated 19 May 2011
Type Journal
Year 2010
Where JMLR
Authors Trevor Cohn, Phil Blunsom, Sharon Goldwater
Comments (0)