We propose and test an objective criterion for evaluation of clustering performance: How well does a clustering algorithm run on unlabeled data aid a classification algorithm? The...
Extensive experimental evidence is required to study the impact of text categorization approaches on real data and to assess the performance within operational scenarios. In this ...
Roberto Basili, Alessandro Moschitti, Maria Teresa...
Cross validation allows models to be tested using the full training set by means of repeated resampling; thus, maximizing the total number of points used for testing and potential...
Background: Numerous feature selection methods have been applied to the identification of differentially expressed genes in microarray data. These include simple fold change, clas...
This paper studies the effects of training data on binary text classification and postulates that negative training data is not needed and may even be harmful for the task. Tradit...