An increasing number of databases are becoming Web-accessible through form-based search interfaces, and many of these sources are database-driven e-commerce sites. It is a daunting...
The AutoFeed system automatically extracts data from semistructured web sites. Previously, researchers have developed two types of supervised learning approaches for extracting we...
In the same way that Wikis have become the mechanism that has enabled groups of users to collaborate on the production of hypertexts on the web, Semantic Wikis promise a future of...
David E. Millard, Chris Bailey, Philip Boulain, Sw...
Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
We consider the problem of efficiently sampling Web search engine query results. In turn, using a small random sample instead of the full set of results leads to efficient approxi...
Aris Anagnostopoulos, Andrei Z. Broder, David Carm...