Topic models are a useful tool for analyzing large text collections, but have previously been applied in only monolingual, or at most bilingual, contexts. Meanwhile, massive colle...
David M. Mimno, Hanna M. Wallach, Jason Naradowsky...
With the rapid emergence and proliferation of Internet and the trend of globalization, a tremendous amount of textual documents written in different languages are electronically ac...
As Chinese text is written without word boundaries, effectively recognizing Chinese words is like recognizing collocations in English, substituting characters for words and words ...
This paper presents the Topic-Aspect Model (TAM), a Bayesian mixture model which jointly discovers topics and aspects. We broadly define an aspect of a document as a characteristi...
In this paper, we propose a PLSA-based language model for sports live speech. This model is implemented in unigram rescaling technique that combines a topic model and an n-gram. I...