Sciweavers

WWW
2007
ACM

A link classification based approach to website topic hierarchy generation

Hierarchical models are commonly used to organize a Website's content. A Website's content structure can be represented by a topic hierarchy, a directed tree rooted at the Website's homepage in which the vertices and edges correspond to Web pages and hyperlinks. In this work, we propose a new method for constructing the topic hierarchy of a Website. We model the Website's link structure as a weighted directed graph, in which the edge weights are computed by a classifier that predicts whether an edge connects a pair of nodes representing a topic and a subtopic. We then pose the problem of building the topic hierarchy as finding the shortest-path tree and the directed minimum spanning tree in the weighted graph. We have conducted extensive experiments on real Websites and obtained very promising results.
Categories and Subject Descriptors: H.3.3 [Information Systems]: Information Search and Retrieval - search process, retrieval models
General Terms: Algorithms, Experimentat...
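The shortest-path-tree formulation in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption rather than the authors' method: the page names and probabilities are made up, the classifier is replaced by a hard-coded table of topic-to-subtopic probabilities, and the edge weight is taken to be the negative log of that probability so that likelier topic/subtopic links are cheaper. The tree itself is found with plain Dijkstra from the homepage.

```python
import heapq
import math


def shortest_path_tree(link_probs, root):
    """Return {page: parent} for the shortest-path tree rooted at `root`.

    link_probs maps (src, dst) hyperlinks to a hypothetical classifier
    probability in (0, 1] that dst is a subtopic of src.
    """
    # Assumed weighting: weight = -log(probability), so weights are >= 0
    # and Dijkstra's algorithm applies.
    graph = {}
    for (src, dst), prob in link_probs.items():
        graph.setdefault(src, []).append((dst, -math.log(prob)))
        graph.setdefault(dst, [])

    dist = {root: 0.0}
    parent = {root: None}
    heap = [(0.0, root)]
    settled = set()
    while heap:
        d, page = heapq.heappop(heap)
        if page in settled:
            continue  # stale heap entry
        settled.add(page)
        for nxt, weight in graph.get(page, []):
            candidate = d + weight
            if nxt not in dist or candidate < dist[nxt]:
                dist[nxt] = candidate
                parent[nxt] = page
                heapq.heappush(heap, (candidate, nxt))
    return parent


# Hypothetical site: each hyperlink with an invented classifier score.
links = {
    ("home", "products"): 0.9,
    ("home", "laptops"): 0.2,
    ("products", "laptops"): 0.8,
    ("laptops", "home"): 0.1,
    ("home", "about"): 0.7,
    ("products", "about"): 0.3,
}
tree = shortest_path_tree(links, "home")
```

In this toy site, "laptops" is attached under "products" rather than directly under "home", because the path through "products" has the lower total weight. The directed minimum spanning tree variant mentioned in the abstract would need a different algorithm (for example Chu-Liu/Edmonds) and is not shown.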
Nan Liu, Christopher C. Yang
Added 21 Nov 2009
Updated 21 Nov 2009
Type Conference
Year 2007
Where WWW
Authors Nan Liu, Christopher C. Yang