ACL
2006

A Hierarchical Bayesian Language Model Based On Pitman-Yor Processes

We propose a new hierarchical Bayesian n-gram model of natural languages. Our model makes use of Pitman-Yor processes, a generalization of the commonly used Dirichlet distributions, which produce power-law distributions that more closely resemble those found in natural languages. We show that an approximation to the hierarchical Pitman-Yor language model recovers the exact formulation of interpolated Kneser-Ney, one of the best smoothing methods for n-gram language models. Experiments verify that our model gives cross-entropy results superior to interpolated Kneser-Ney and comparable to modified Kneser-Ney.
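
The power-law behaviour mentioned in the abstract can be illustrated with the Chinese restaurant representation of a single Pitman-Yor process. The sketch below is not the paper's hierarchical model or its inference procedure; it only simulates one restaurant, and the names `pitman_yor_crp`, `discount`, and `strength` are hypothetical, chosen for this illustration. With `n` customers seated at `K` tables, the next customer joins table `k` with probability proportional to `c_k - discount` and opens a new table with probability proportional to `strength + discount * K`. Setting `discount = 0` gives the Dirichlet process case.

```python
import random


def pitman_yor_crp(n_customers, discount, strength, seed=0):
    """Simulate a Pitman-Yor Chinese restaurant; return the table sizes.

    Illustrative sketch only: `discount` (d) and `strength` (theta) are
    assumed names, with 0 <= d < 1 and theta > -d.
    """
    if not (0.0 <= discount < 1.0) or strength <= -discount:
        raise ValueError("need 0 <= discount < 1 and strength > -discount")
    rng = random.Random(seed)
    tables = []  # tables[k] = number of customers seated at table k
    for n in range(n_customers):
        # Total unnormalised mass over all choices is (strength + n).
        u = rng.random() * (strength + n)
        # Mass for opening a new table: strength + discount * K.
        new_mass = strength + discount * len(tables)
        if not tables or u < new_mass:
            tables.append(1)
            continue
        u -= new_mass
        for k, count in enumerate(tables):
            # Mass for joining existing table k: count - discount.
            u -= count - discount
            if u < 0:
                tables[k] += 1
                break
        else:
            # Guard against floating-point round-off at the upper edge.
            tables[-1] += 1
    return tables


if __name__ == "__main__":
    for d in (0.0, 0.5, 0.8):
        sizes = pitman_yor_crp(2000, discount=d, strength=1.0)
        print(f"discount={d}: {len(sizes)} tables for {sum(sizes)} customers")
```

With a positive discount the number of occupied tables grows roughly as a power of the number of customers, whereas with `discount = 0` it grows only logarithmically. In the language-modelling reading, tables play the role of distinct word types, which is why the Pitman-Yor prior is a better match for the long tail of rare words.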
Yee Whye Teh
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2006
Where ACL
Authors Yee Whye Teh