Sciweavers

SMC
2010
IEEE

Deep web data extraction

Deep Web contents are accessed by queries submitted to Web databases, and the returned data records are wrapped in dynamically generated Web pages (called deep Web pages in this paper). Extracting structured data from deep Web pages is a challenging problem because of the intricate underlying structures of such pages. A large number of techniques have been proposed to address this problem, but all of them have inherent limitations because they are Web-page-programming-language-dependent. Because Web pages are a popular two-dimensional medium, their contents are always displayed in a regular layout for users to browse. This motivates us to seek a different way of extracting deep Web data, one that overcomes the limitations of previous work by exploiting common visual features of deep Web pages. In this paper, a novel vision-based approach that is Web-page-programming-language-independent is proposed. This approach primarily utilizes the visual features on the deep Web...
Jer Lang Hong
Added 30 Jan 2011
Updated 30 Jan 2011
Type Journal
Year 2010
Where SMC
Authors Jer Lang Hong