Syntactic Topic Models

12 years 1 months ago
Syntactic Topic Models
We develop the syntactic topic model (STM), a nonparametric Bayesian model of parsed documents. The STM generates words that are both thematically and syntactically constrained, which combines the semantic insights of topic models with the syntactic information available from parse trees. Each word of a sentence is generated by a distribution that combines document-specific topic weights and parse-tree-specific syntactic transitions. Words are assumed to be generated in an order that respects the parse tree. We derive an approximate posterior inference method based on variational methods for hierarchical Dirichlet processes, and we report qualitative and quantitative results on both synthetic data and hand-parsed documents.
Jordan L. Boyd-Graber, David M. Blei
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where NIPS
Authors Jordan L. Boyd-Graber, David M. Blei
Comments (0)