This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. ...
Kamal Nigam, Andrew McCallum, Sebastian Thrun, Tom...
We consider the problem of learning a Riemannian metric associated with a given differentiable manifold and a set of points. Our approach to the problem involves choosing a metric...
We propose to solve a text categorization task using a new metric between documents, based on a priori semantic knowledge about words. This metric can be incorporated into the def...
With the rapid growth of the Internet, most of the textual data in the form of newspapers, magazines and journals tend to be available on-line. Summarizing these texts can aid the...
P. Arun Kumar, K. Praveen Kumar, T. Someswara Rao,...
This paper describes an investigation of authorship gender attribution mining from e-mail text documents. We used an extended set of predominantly topic content-free e-mail docume...
Malcolm Corney, Olivier Y. de Vel, Alison Anderson...