We present in this paper ObjectRunner, a system for extracting, integrating and querying structured data from the Web. Our system harvests real-world items from template-based HTM...
A large number of online databases are hidden behind the web. Users to these systems can form queries through web forms to retrieve a small sample of the database. Sampling such h...
Anirban Maiti, Arjun Dasgupta, Nan Zhang, Gautam D...
Web search engines like Google have made us all smarter by providing ready access to the world's knowledge whenever we need to look up a fact, learn about a topic or evaluate...
The rapid growth of the web has been noted and tracked extensively. Recent studies have however documented the dual phenomenon: web pages have small half lives, and thus the web e...
Ziv Bar-Yossef, Andrei Z. Broder, Ravi Kumar, Andr...
Abstract. The Web and the Semantic Web, as we see it, can be understood as a “living organism” combining autonomously evolving data sources, each of them possibly reacting to e...