This paper describes a high performance sampling architecture for inference of latent topic models on a cluster of workstations. Our system is faster than previous work by over an...
Analyzing sequence data has become increasingly important recently in the area of biological sequences, text documents, web access logs, etc. In this paper, we investigate the pro...
: We believe that much novel insight into the worldwide web can be obtained from taking into account the important fact that it is created, used, and run by selfish optimizing agen...
Georgios Kouroupas, Elias Koutsoupias, Christos H....
Entity detection and tracking is a relatively new addition to the repertoire of natural language tasks. In this paper, we present a statistical language-independent framework for ...
Radu Florian, Hany Hassan, Abraham Ittycheriah, Ho...
Most text mining methods are based on representing documents using a vector space model, commonly known as a bag of word model, where each document is modeled as a linear vector r...
Rowena Chau, Ah Chung Tsoi, Markus Hagenbuchner, V...