Fully automatic wrapper generation for search engines

When a query is submitted to a search engine, the search engine returns a dynamically generated result page containing the result records, each of which usually consists of a link to, and/or a snippet of, a retrieved Web page. Such a result page often also contains information irrelevant to the query, such as advertisements and information related to the hosting site of the search engine. In this paper, we present a technique for automatically producing wrappers that can be used to extract search result records from dynamically generated result pages returned by search engines. Automatic search result record extraction is very important for many applications that need to interact with search engines, such as the automatic construction and maintenance of metasearch engines and deep Web crawling. The novel aspect of the proposed technique is that it utilizes both the visual content features on the result page as displayed in a browser and the HTML tag structures of the HTML source f...
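To make the idea of a wrapper concrete, here is a minimal sketch in Python. It is not the paper's technique: it uses only the HTML tag structure, ignores the visual features entirely, and rests on the assumption (ours, for illustration) that result records form the largest group of links sharing one tag path, while navigation and advertisement links sit under other paths. The names `LinkPathCollector`, `extract_record_links`, and the sample `PAGE` are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Void elements never receive a closing tag, so they are kept off the stack.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class LinkPathCollector(HTMLParser):
    """Record each <a href> together with the tag path leading to it."""

    def __init__(self):
        super().__init__()
        self.stack = []                 # currently open tags
        self.links = defaultdict(list)  # tag path -> hrefs found under it

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self.stack.append(tag)
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links["/".join(self.stack)].append(href)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def extract_record_links(html):
    """Return the hrefs under the most frequent link tag path."""
    collector = LinkPathCollector()
    collector.feed(html)
    if not collector.links:
        return []
    # Assumption: the largest same-path group holds the result records.
    return max(collector.links.values(), key=len)


# Hypothetical result page: a navigation bar, three records, one ad.
PAGE = """
<html><body>
<div id="nav"><a href="/home">Home</a><a href="/about">About</a></div>
<ul>
<li><a href="http://a.example/1">First</a><br>snippet one</li>
<li><a href="http://b.example/2">Second</a><br>snippet two</li>
<li><a href="http://c.example/3">Third</a><br>snippet three</li>
</ul>
<p><a href="/ads">Sponsored</a></p>
</body></html>
"""

if __name__ == "__main__":
    print(extract_record_links(PAGE))
```

On the sample page the three record links share the path `html/body/ul/li/a`, so they outnumber the two navigation links and the single advertisement link. A heuristic this simple fails whenever non-record links outnumber the records, which is one reason to combine tag structure with other evidence such as visual layout.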

Hongkun Zhao, Weiyi Meng, Zonghuan Wu, Vijay Raghavan, Clement T. Yu

Internet Technology | Keywords Information Extraction | Search Engine | Search Engines | WWW 2005 |

» MySearchView a customized metasearch engine generator

» Database Wrappers Development Towards Automatic Generation

» Wrapper Generation for Web Accessible Data Sources

» Automatic extraction of clickable structured web contents for name entity queries

» Mining templates from search result records of search engines

» Automatic Extraction of Publication Time from News Search Results

» A Supervised Visual Wrapper Generator for WebData Extraction

» Interactive wrapper generation with minimal user effort

Post Info

Added	22 Nov 2009
Updated	22 Nov 2009
Type	Conference
Year	2005
Where	WWW
Authors	Hongkun Zhao, Weiyi Meng, Zonghuan Wu, Vijay Raghavan, Clement T. Yu
