This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. ...
Kamal Nigam, Andrew McCallum, Sebastian Thrun, Tom...
We consider classification of email messages as to whether or not they contain certain “email acts”, such as a request or a commitment. We show that exploiting the sequential ...
A new class of data structures called "bumptrees" is described. These structures are useful for efficiently implementing a number of neural-network-related operations. A...
Large-scale logistic regression arises in many applications such as document classification and natural language processing. In this paper, we apply a trust region Newton method t...
Imbalanced data sets present a particular challenge to the data mining community. Often, it is the rare event that is of interest and the cost of misclassifying the rare ...