
WISE
2005
Springer

Semantic Partitioning of Web Pages

In this paper we describe the semantic partitioner algorithm, which uses the structural and presentation regularities of Web pages to automatically transform them into hierarchical content structures. These content structures enable us to automatically annotate labels in the Web pages with their semantic roles, thus yielding meta-data and instance information for the Web pages. Experimental results with the TAP knowledge base and computer science department Web sites, comprising 16,861 Web pages, indicate that our algorithm is able to gather meta-data accurately from various types of Web pages. The algorithm achieves this performance without any domain-specific engineering requirement.
Srinivas Vadrevu, Fatih Gelgi, Hasan Davulcu
Added 25 Jun 2010
Updated 25 Jun 2010
Type Conference
Year 2005
Where WISE
Authors Srinivas Vadrevu, Fatih Gelgi, Hasan Davulcu