
EMNLP
2007

Compressing Trigram Language Models With Golomb Coding

Trigram language models are compressed using a Golomb coding method inspired by the original Unix spell program. Compression methods trade off space, time and accuracy (loss). The proposed HashTBO method optimizes space at the expense of time and accuracy. Trigram language models are normally considered memory hogs, but with HashTBO, it is possible to squeeze a trigram language model into a few megabytes or less. HashTBO made it possible to ship a trigram contextual speller in Microsoft Office 2007.
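To make the Golomb coding idea concrete, the following is a minimal sketch, not the HashTBO implementation. It assumes the spell-style scheme of hashing each n-gram, sorting the hash values, and coding the gaps between neighbours. For simplicity it uses the power-of-two special case of Golomb coding (Rice coding). The function names, the SHA-256 hash, and the parameters `modulus` and `k` are all illustrative assumptions, and the sketch stores only set membership, not the probabilities a real language model needs.

```python
import hashlib


def rice_encode(values, k):
    """Encode non-negative ints as a bit string.

    Each value is split into a quotient (written in unary as q ones
    followed by a zero) and a k-bit binary remainder.
    """
    parts = []
    for v in values:
        q, r = v >> k, v & ((1 << k) - 1)
        parts.append("1" * q + "0")
        if k:
            parts.append(format(r, "0{}b".format(k)))
    return "".join(parts)


def rice_decode(bits, k, count):
    """Decode `count` values produced by rice_encode with the same k."""
    values, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1  # skip the zero that ends the unary quotient
        r = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        values.append((q << k) | r)
    return values


def hash_ngram(ngram, modulus):
    """Map an n-gram (tuple of words) to an integer in [0, modulus).

    SHA-256 is an arbitrary choice for this sketch.
    """
    digest = hashlib.sha256(" ".join(ngram).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % modulus


def compress(ngrams, modulus, k):
    """Hash, sort, and gap-encode a collection of n-grams."""
    hashes = sorted({hash_ngram(g, modulus) for g in ngrams})
    gaps = [b - a for a, b in zip([0] + hashes, hashes)]
    return rice_encode(gaps, k), len(hashes)


def decompress(bits, k, count):
    """Recover the sorted hash values by summing the decoded gaps."""
    hashes, total = [], 0
    for gap in rice_decode(bits, k, count):
        total += gap
        hashes.append(total)
    return hashes


if __name__ == "__main__":
    trigrams = [("the", "cat", "sat"), ("cat", "sat", "on"), ("sat", "on", "the")]
    bits, n = compress(trigrams, modulus=1 << 16, k=12)
    print(len(bits), "bits for", n, "hashed trigrams")
    print(decompress(bits, 12, n))
```

Because only hash values are kept, two different trigrams can collide, which is one way a scheme like this trades accuracy for space. Decoding is sequential, which illustrates the time cost: finding one entry means walking the gaps that precede it unless an index is added.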
Kenneth Church, Ted Hart, Jianfeng Gao
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2007
Where EMNLP