Sciweavers

APWEB
2008
Springer

A Method for Web Information Extraction

The World Wide Web has become one of the most important information repositories. However, information in web pages follows no presentation standard and is not organized in a well-defined format, so extracting appropriate and useful information from web pages is challenging. Many web extraction systems, called web wrappers, have been developed, some semi-automatic and some fully automatic. In this paper, some existing techniques are investigated, and then our current work on web information extraction is presented. In our design, we classify information patterns into static and non-static structures and use different techniques to extract the relevant information from each. In our implementation, patterns are represented as XSL files, and all extracted information is packaged into a machine-readable XML format.
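The pattern-to-XML idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper stores patterns as XSL files, whereas the sketch below substitutes plain path expressions from Python's standard `xml.etree.ElementTree` module. The page fragment, the `PATTERN` mapping, the `extract` function, and the `record` element name are all hypothetical, and the input is assumed to be well-formed markup (real web pages usually need an HTML parser first).

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for illustration.
PAGE = """
<html><body>
  <div class="paper">
    <h1>A Method for Web Information Extraction</h1>
    <span class="venue">APWEB</span>
    <span class="year">2008</span>
  </div>
</body></html>
"""

# Hypothetical "pattern": output field name -> ElementTree path expression.
# Assumption: these paths stand in for the XSL pattern files the paper describes.
PATTERN = {
    "title": ".//div[@class='paper']/h1",
    "venue": ".//span[@class='venue']",
    "year": ".//span[@class='year']",
}


def extract(page: str, pattern: dict) -> ET.Element:
    """Apply each path in the pattern and package the matches as one XML record."""
    root = ET.fromstring(page)
    record = ET.Element("record")
    for field, path in pattern.items():
        node = root.find(path)  # first match only; unmatched fields are skipped
        if node is not None and node.text:
            ET.SubElement(record, field).text = node.text.strip()
    return record


record = extract(PAGE, PATTERN)
xml_out = ET.tostring(record, encoding="unicode")
print(xml_out)
```

A fixed mapping like this would only suit what the abstract calls static structures; pages whose layout varies would need a different technique, which the abstract mentions but does not detail.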
Man I. Lam, Zhiguo Gong, Maybin K. Muyeba
Added 12 Oct 2010
Updated 12 Oct 2010
Type Conference
Year 2008
Where APWEB
Authors Man I. Lam, Zhiguo Gong, Maybin K. Muyeba