SIGIR
2000
ACM

Hierarchical classification of Web content

This paper explores the use of hierarchical structure for classifying a large, heterogeneous collection of Web content. The hierarchical structure is initially used to train different second-level classifiers. In the hierarchical case, a model is learned to distinguish a second-level category from other categories within the same top level. In the flat non-hierarchical case, a model distinguishes a second-level category from all other second-level categories. Scoring rules can further take advantage of the hierarchy by considering only second-level categories that exceed a threshold at the top level. We use support vector machine (SVM) classifiers, which have been shown to be efficient and effective for classification, but have not previously been explored in the context of hierarchical classification. We found small advantages in accuracy for hierarchical models over flat models. For the hierarchical approach, we found the same accuracy using a sequential Boolean decision rule and a multiplica...
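
The contrast between the flat case and the sequential Boolean decision rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the category labels, the score dictionaries, and the threshold values are all hypothetical, and the scores are assumed to come from already-trained classifiers (the paper uses SVMs, which are not reproduced here). The second rule mentioned in the abstract is truncated in this listing, so it is deliberately left out.

```python
from typing import Dict, List


def flat_rule(second_scores: Dict[str, float],
              second_threshold: float) -> List[str]:
    """Flat case: accept every second-level category whose own score
    exceeds the threshold, ignoring the top level entirely."""
    return sorted(c for c, s in second_scores.items() if s > second_threshold)


def sequential_boolean_rule(top_scores: Dict[str, float],
                            second_scores: Dict[str, float],
                            parent_of: Dict[str, str],
                            top_threshold: float,
                            second_threshold: float) -> List[str]:
    """Hierarchical case (assumed form): a second-level category is
    considered only if its top-level parent exceeds top_threshold,
    and is accepted only if its own score exceeds second_threshold."""
    accepted = []
    for category, score in second_scores.items():
        parent = parent_of[category]
        # Gate on the top level first; a missing parent score counts as 0.
        if top_scores.get(parent, 0.0) <= top_threshold:
            continue
        if score > second_threshold:
            accepted.append(category)
    return sorted(accepted)


if __name__ == "__main__":
    # Hypothetical two-level hierarchy and scores for a single page.
    parent_of = {
        "Computers/Hardware": "Computers",
        "Computers/Software": "Computers",
        "Sports/Soccer": "Sports",
    }
    top_scores = {"Computers": 0.9, "Sports": 0.1}
    second_scores = {
        "Computers/Hardware": 0.2,
        "Computers/Software": 0.8,
        "Sports/Soccer": 0.7,
    }
    print(flat_rule(second_scores, 0.5))
    print(sequential_boolean_rule(top_scores, second_scores, parent_of, 0.5, 0.5))
```

In this toy input the flat rule accepts "Sports/Soccer" on its second-level score alone, while the sequential rule drops it because its parent "Sports" does not pass the top-level threshold. In the setting the abstract describes, the second-level scores for the two cases would come from differently trained models; the sketch reuses one score dictionary only to keep the comparison short.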
Susan T. Dumais, Hao Chen
Added 01 Aug 2010
Updated 01 Aug 2010
Type Conference
Year 2000
Where SIGIR