This paper presents a cluster validation based document clustering algorithm, which is capable of identifying both important feature words and true model order (cluster number). I...
We propose and test an objective criterion for evaluation of clustering performance: How well does a clustering algorithm run on unlabeled data aid a classification algorithm? The...
A wrapped feature selection process is proposed in the context of robust clustering based on Laplace mixture models. The clustering approach we consider is a generalization of the...
We report performance evaluation of our automatic feature discovery method on the publicly available Gisette dataset: a set of 29 features discovered by our method ranks 129 among...
Abstract. A major characteristic of text document categorization problems is the extremely high dimensionality of text data. In this paper we explore the usability of the Oscillati...