Taxonomies of the Web typically have hundreds of thousands of categories and skewed category distribution over documents. It is not clear whether existing text classification tech...
Tie-Yan Liu, Yiming Yang, Hao Wan, Qian Zhou, Bin ...
This paper proposes a new parallel execution model where programmers augment a sequential program with pieces of code called serializers that dynamically map computational operati...
Matthew D. Allen, Srinath Sridharan, Gurindar S. S...
We combine the speed and scalability of information retrieval with the generally superior classification accuracy offered by machine learning, yielding a two-phase text classifie...
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as the underlying process for generating the corpus and utilize a novel,...
In this paper we describe a virtual laboratory that is designed to accelerate scientific exploration and discovery by minimizing the time between the generation of a scie...
Judith Ellen Devaney, Steven G. Satterfield, John ...