Effective document classification is a long-pursued goal in knowledge management. This paper proposes a novel hybrid approach of semantic representation and statistical measuremen...
In this paper we describe work relating to classification of web documents using a graph-based model instead of the traditional vector-based model for document representation. We ...
Adam Schenker, Mark Last, Horst Bunke, Abraham Kan...
Term signal is an existing text representation that depicts a term as a vector of frequencies of occurrences in a number of user-defined partitions of a document. Although term si...
Supphachai Thaicharoen, Tom Altman, Krzysztof J. C...
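As an illustration of the term signal representation described in the abstract above, here is a minimal sketch. It assumes whitespace-tokenized input and equal-width partitions; the abstract says only that partitions are user-defined, so both choices are assumptions. The function name `term_signals` and its parameters are hypothetical and are not taken from the paper.

```python
def term_signals(tokens, num_partitions):
    """Map each term to a list of per-partition occurrence counts.

    Assumption: the document is cut into `num_partitions` equal-width
    partitions by token position. The original work may partition
    documents differently.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")

    n = len(tokens)
    signals = {}
    for position, term in enumerate(tokens):
        # Index of the partition that contains this token position.
        partition = position * num_partitions // n
        # Create an all-zero signal the first time a term is seen.
        signal = signals.setdefault(term, [0] * num_partitions)
        signal[partition] += 1
    return signals


# Hypothetical example document, eight tokens split into four partitions.
example_tokens = "a b a c b a d a".split()
example_signals = term_signals(example_tokens, 4)
```

In this sketch, a term spread evenly through the document gets a flat signal, while a term concentrated in one part of the document gets a signal with a single peak. That positional information is what a plain term-frequency vector discards.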
When dealing with genres of web pages, there are two important aspects to be taken into account. On the one hand, the web is fluid, unstable and fast-paced. On the other hand, gen...