Decision lists (or ordered rule sets) have two attractive properties compared to unordered rule sets: they require a simpler classification procedure and they allow for a more co...
The discovery of characteristic rules is a well-known data mining task and has lead to several successful applications. However, because of the descriptive nature of characteristic...
Retweeting is an important action (behavior) on Twitter, indicating the behavior that users re-post microblogs of their friends. While much work has been conducted for mining text...
Zi Yang, Jingyi Guo, Keke Cai, Jie Tang, Juanzi Li...
Abstract. In this paper, we present an extensive study of the cuttingplane algorithm (CPA) applied to structural kernels for advanced text classification on large datasets. In par...
We present collaborative peer-to-peer algorithms for the problem of approximating frequency counts for popular items distributed across the peers of a large-scale network. Our alg...