Recent work on distributional methods for similarity focuses on using the context in which a target word occurs to derive context-sensitive similarity computations. In this paper ...
Standard cursive handwriting recognition is based on a language model, mostly a lexicon of possible word hypotheses or character n-grams. The result is a list of word alternatives...
We propose an online topic model for sequentially analyzing the time evolution of topics in document collections. Topics naturally evolve on multiple timescales. For example, so...
The difficulty with information retrieval for OCR documents lies in the fact that OCR documents contain a significant number of erroneous words and unfortunately most informat...
We develop a method for performing boolean convolutions efficiently in the word RAM model of computation, with a word size of w = Ω(log n) bits, where n is the input size. The tech...