Unsupervised learning algorithms have been derived for several statistical models of English grammar, but their computational complexity makes applying them to large data sets int...
Sense annotation and lexicon building are costly affairs demanding prudent investment of resources. Recent work on multilingual WSD has shown that it is possible to leverage the a...
Mitesh M. Khapra, Saurabh Sohoney, Anup Kulkarni, ...
Using a lexicon can often improve character recognition under challenging conditions, such as poor image quality or unusual fonts. We propose a flexible probabilistic model for c...
Jerod J. Weinman, Erik G. Learned-Miller, Allen R....
Latent topic models have been successfully applied as an unsupervised topic discovery technique in large document collections. With the proliferation of hypertext document collect...
Some models of textual corpora employ text generation methods involving n-gram statistics, while others use latent topic variables inferred using the "bag-of-words" assu...