Data--call records, internet packet headers, or other transaction records--are coming down a pipe at a ferocious rate, and we need to monitor statistics of the data. There is no r...
In this paper, we formally define the problem of topic modeling with network structure (TMN). We propose a novel solution to this problem, which regularizes a statistical topic mo...
In large scale online systems like Search, eCommerce, or social network applications, user queries represent an important dimension of activities that can be used to study the imp...
The web contains lots of interesting factual information about entities, such as celebrities, movies or products. This paper describes a robust bootstrapping approach to corrobora...
Classification is a well-established operation in text mining. Given a set of labels A and a set DA of training documents tagged with these labels, a classifier learns to assign l...