Sciweavers

43 search results - page 1 / 9
» Crawling the Content Hidden Behind Web Forms
ICCSA
2007
Springer
13 years 11 months ago
Crawling the Content Hidden Behind Web Forms
The crawler engines of today cannot reach most of the information contained in the Web. A great amount of valuable information is “hidden” behind the query forms of online data...
Manuel Álvarez, Juan Raposo, Alberto Pan, F...
WWW
2001
ACM
14 years 5 months ago
Crawling the Hidden Web
Current-day crawlers retrieve content only from the publicly indexable Web, i.e., the set of Web pages reachable purely by following hypertext links, ignoring search forms and pag...
Sriram Raghavan, Hector Garcia-Molina
PVLDB
2008
13 years 4 months ago
Google's Deep Web crawl
The Deep Web, i.e., content hidden behind HTML forms, has long been acknowledged as a significant gap in search engine coverage. Since it represents a large portion of the structu...
Jayant Madhavan, David Ko, Lucja Kot, Vignesh Gana...
JUCS
2008
13 years 4 months ago
Structure-Based Crawling in the Hidden Web
The number of applications that need to crawl the Web to gather data is growing at an ever increasing pace. In some cases, the criterion to determine what pages must be included ...
Márcio L. A. Vidal, Altigran Soares da Silv...
SIGMOD
2009
ACM
14 years 5 months ago
HDSampler: revealing data behind web form interfaces
A large number of online databases are hidden behind the web. Users of these systems can form queries through web forms to retrieve a small sample of the database. Sampling such h...
Anirban Maiti, Arjun Dasgupta, Nan Zhang, Gautam D...