We describe an adaptive method for extracting records from web pages. Our algorithm combines a weighted tree matching metric with clustering for obtaining data extraction patterns...
Distributed crawling has shown that it can overcome important limitations of the centralized crawling paradigm. However, the distributed nature of current distributed cra...
We present the main features of a system designed to support the development and delivery of web applications through concepts for modularity, reuse and rapid prototyping...
The subject of this paper is the semi-automatic construction of taxonomies over the Web. We address the problem of discovering high-quality resources that belong in a particular n...
Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopala...
In this paper we describe a novel incremental, semi-automated method for composing web services in a geographical domain. First, we present the incremental co...