In this paper we propose a methodology to learn to extract domain-specific information from large repositories (e.g. the Web) with minimum user intervention. Learning is seeded b...
Fabio Ciravegna, Alexiei Dingli, David Guthrie, Yo...
This paper presents a novel method for extracting information from collections of Web pages across different sites. Our method uses a standard wrapper induction algorithm and explo...
Online auction Web sites are fast-changing, highly dynamic, and complex, as they involve a tremendous number of sellers and potential buyers, as well as a huge number of items listed for biddi...
We propose an algorithm for extracting fields from HTML search results. The output of the algorithm is a database table, a data structure that better lends itself to high-level...
Data Webhouses are used to retain all the information related to web users' behavior within a web site, working as a shared repository of business data. The advent of e-busin...