Sciweavers

WEBDB
1999
Springer

Web Ecology: Recycling HTML Pages as XML Documents Using W4F

In this paper we present the World-Wide Web Wrapper Factory (W4F), a Java toolkit to generate wrappers for Web data sources. Some key features of W4F are an expressive language to extract information from HTML pages in a structured way, a mapping to export it as XML documents, and some visual tools to assist the user during wrapper creation. Moreover, the entire description of wrappers is fully declarative. As an illustration, we demonstrate how to use W4F to create XML gateways that serve HTML pages transparently and on the fly as XML documents with their DTDs.
Arnaud Sahuguet, Fabien Azavant
Added 05 Aug 2010
Updated 05 Aug 2010
Type Conference
Year 1999
Where WEBDB
Authors Arnaud Sahuguet, Fabien Azavant