Sciweavers

ICPR
2008
IEEE

Categorization using semi-supervised clustering

Many applications require matching objects to a predefined yet highly dynamic set of categories, each accompanied by a category description. We present a novel approach to this class of categorization problems by formulating it in a semi-supervised clustering framework. Text-based matching is performed to generate “soft” seeds, which are then used to guide clustering in the basic feature space. We introduce a new variation of the k-means algorithm, called Soft Seeded k-means, which can effectively incorporate seeds of varying degrees of confidence while allowing for incomplete coverage of the predefined categories. The algorithm is applied to real-world data from a business analytics application, and we demonstrate that it leads to superior performance compared to previous approaches.
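The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea of seeding k-means with labels of varying confidence, not the authors' Soft Seeded k-means. Everything in it is an assumption made for illustration: the function name `soft_seeded_kmeans`, the `penalty` weight, the confidence-weighted initialization, the additive cost for overriding a seed, and the farthest-point rule for categories that received no seeds.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def soft_seeded_kmeans(points, seeds, k, penalty=1.0, n_iter=20):
    """Illustrative k-means guided by soft seeds (hypothetical formulation).

    points : list of feature vectors (lists of floats)
    seeds  : dict {point index: (cluster id, confidence in [0, 1])}
    k      : number of clusters; some clusters may have no seeds
    penalty: assumed weight for assigning a seed away from its label
    """
    dim = len(points[0])
    centroids = [None] * k

    # Seeded clusters start at the confidence-weighted mean of their seeds.
    for c in range(k):
        members = [(points[i], conf) for i, (lab, conf) in seeds.items()
                   if lab == c and conf > 0]
        if members:
            total = sum(conf for _, conf in members)
            centroids[c] = [sum(p[d] * conf for p, conf in members) / total
                            for d in range(dim)]

    # Unseeded clusters (incomplete coverage) start at the point farthest
    # from the centroids chosen so far.
    for c in range(k):
        if centroids[c] is None:
            chosen = [m for m in centroids if m is not None]
            if chosen:
                far = max(points,
                          key=lambda p: min(sq_dist(p, m) for m in chosen))
            else:
                far = points[0]
            centroids[c] = list(far)

    labels = None
    for _ in range(n_iter):
        new_labels = []
        for i, p in enumerate(points):
            def cost(c):
                # Distance plus a confidence-scaled cost for overriding a seed.
                d = sq_dist(p, centroids[c])
                if i in seeds and seeds[i][0] != c:
                    d += penalty * seeds[i][1]
                return d
            new_labels.append(min(range(k), key=cost))

        # Move each non-empty cluster to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]

        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Toy data: three groups, but only two categories have seeds.
points = [[0, 0], [0, 1], [1, 0],
          [10, 10], [10, 11], [11, 10],
          [20, 0], [21, 0], [20, 1]]
seeds = {0: (0, 0.9), 3: (1, 0.6)}
labels, centroids = soft_seeded_kmeans(points, seeds, k=3)
print(labels)
```

In this sketch a low-confidence seed adds only a small cost, so a nearby centroid can still override it, while a high-confidence seed behaves almost like a hard constraint. On the toy data the third group has no seed and is still recovered as its own cluster.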
Jianying Hu, Moninder Singh, Aleksandra Mojsilovic
Added 30 May 2010
Updated 30 May 2010
Type Conference
Year 2008
Where ICPR