Sciweavers

1947 search results - page 147 / 390
» On the Automatic Extraction of Data from the Hidden Web
VLDB
2005
ACM
141 views, Database
15 years 3 months ago
Automatic Data Fusion with HumMer
Heterogeneous and dirty data is abundant. It is stored under different, often opaque schemata, it represents identical real-world objects multiple times, causing duplicates, and ...
Alexander Bilke, Jens Bleiholder, Christoph Bö...
SIGMOD
2006
ACM
107 views, Database
15 years 10 months ago
Documentum ECI self-repairing wrappers: performance analysis
Documentum Enterprise Content Integration (ECI) services is a content integration middleware that provides one-query access to the Intranet and Internet content resources. The ECI...
Boris Chidlovskii, Bruno Roustant, Marc Brette
WWW
2010
ACM
15 years 5 months ago
Not so creepy crawler: easy crawler generation with standard xml queries
Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
ICWE
2007
Springer
15 years 4 months ago
Fixing Weakly Annotated Web Data Using Relational Models
In this paper, we present a fast and scalable Bayesian model for improving weakly annotated data – which is typically generated by a (semi) automated information extraction (IE) ...
Fatih Gelgi, Srinivas Vadrevu, Hasan Davulcu
CAISE
2003
Springer
15 years 3 months ago
Real-time Data Warehousing with Temporal Requirements
Flexibility to react on rapidly changing general conditions of the environment has become a key factor for economic success of any company. The competitiveness of an ente...
Francisco Araque