In this paper, we address a relatively new and interesting text categorization problem: classify a political blog as either liberal or conservative, based on its political leaning...
To exploit the similarity information hidden in the hyperlink structure of the web, this paper introduces algorithms scalable to graphs with billions of vertices on a distributed ...
Machine learning and data mining can be effectively used to model, classify and discover interesting information for a wide variety of data including email. The Email Mining Toolk...
In this paper, we formally define the problem of topic modeling with network structure (TMN). We propose a novel solution to this problem, which regularizes a statistical topic mo...
Curating information from the literature for storage in databases is a crucial task in biological research, and many groups assign a particular person or team to that process. We ...