ACL
2000

Distribution-Based Pruning of Backoff Language Models

We propose a distribution-based pruning method for n-gram backoff language models. Instead of the conventional approach of pruning n-grams that are infrequent in the training data, we prune n-grams that are likely to be infrequent in a new document. Our method is based on the n-gram distribution, i.e., the probability that an n-gram occurs in a new document. Experimental results show that our method outperforms conventional cutoff methods by 7-9% in terms of word perplexity reduction.
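The contrast between the two pruning criteria can be illustrated with a minimal sketch. The abstract does not say how the probability that an n-gram occurs in a new document is estimated, so the code below substitutes a simple stand-in: the fraction of training documents that contain the n-gram. That estimator, the function names, the thresholds, and the toy corpus are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def bigrams(tokens):
    """Return the list of adjacent token pairs in one document."""
    return list(zip(tokens, tokens[1:]))


def count_cutoff_prune(docs, min_count):
    """Conventional cutoff: keep bigrams whose total training count >= min_count."""
    counts = Counter(bg for doc in docs for bg in bigrams(doc))
    return {bg for bg, c in counts.items() if c >= min_count}


def distribution_prune(docs, min_doc_prob):
    """Illustrative stand-in for distribution-based pruning.

    Assumption: the probability that a bigram occurs in a new document is
    approximated by the fraction of training documents containing it.
    """
    doc_freq = Counter()
    for doc in docs:
        # Count each bigram at most once per document.
        doc_freq.update(set(bigrams(doc)))
    n_docs = len(docs)
    return {bg for bg, df in doc_freq.items() if df / n_docs >= min_doc_prob}


# Toy corpus (hypothetical): the first document repeats one phrase many times.
docs = [
    "the cat sat the cat sat the cat sat".split(),
    "the dog ran".split(),
    "the dog slept".split(),
]

kept_by_count = count_cutoff_prune(docs, min_count=2)
kept_by_dist = distribution_prune(docs, min_doc_prob=0.5)
```

In this toy corpus the count cutoff keeps bigrams such as ("the", "cat") because they are repeated within a single document, while the document-level criterion keeps only ("the", "dog"), which appears in two of the three documents. This is the kind of difference the abstract describes: frequency in the training data versus likelihood of appearing in a new document.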
Jianfeng Gao, Kai-Fu Lee
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 2000
Where ACL
Authors Jianfeng Gao, Kai-Fu Lee