A critical problem in developing information agents for the Web is accessing data that is formatted for human use. We have developed a set of tools for extracting data from web si...
Craig A. Knoblock, Kristina Lerman, Steven Minton,...
Describing relational data sources (i.e., databases) by means of ontologies constitutes the foundation of most of the semantic-based approaches to data access and integration. In sp...
Heterogeneous and dirty data is abundant. It is stored under different, often opaque schemata, it represents identical real-world objects multiple times, causing duplicates, and ...
Alexander Bilke, Jens Bleiholder, Christoph Bö...
Web documents present new challenges to conventional Information Retrieval (IR) technologies. This paper describes how these challenges are faced in FameIR, a multilingual multime...
In this paper we present strategies for successfully capturing updates at Web sources. Web-based information agents provide integrated access to autonomous Web sources that can ge...