This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. ...
Kamal Nigam, Andrew McCallum, Sebastian Thrun, Tom...
Metonymy recognition is generally approached with complex algorithms that rely heavily on the manual annotation of training and test data. This paper will relieve this complexity ...
In this paper we address the issue of automatically assigning information status to discourse entities. Using an annotated corpus of conversational English and exploiting morpho-s...
In this paper, we show that stylistic text features can be exploited to determine an anonymous author's native language with high accuracy. Specifically, we first use automat...
In this paper we present a method to cluster large datasets that change over time using incremental learning techniques. The approach is based on the dynamic representation of clus...