Sciweavers

146 search results - page 2 / 30
» RoadRunner: Towards Automatic Data Extraction from Large Web...
SIGMOD
2003
ACM
190 views, Database
13 years 10 months ago
Extracting Structured Data from Web Pages
Many web sites contain large sets of pages generated using a common template or layout. For example, Amazon lays out the author, title, comments, etc. in the same way in all its b...
Arvind Arasu, Hector Garcia-Molina
WEBDB
2010
Springer
156 views, Database
13 years 10 months ago
Redundancy-Driven Web Data Extraction and Integration
A large number of web sites publish pages containing structured information about recognizable concepts, but these data are only partially used by current applications. Although s...
Paolo Papotti, Valter Crescenzi, Paolo Merialdo, M...
BTW
2005
Springer
125 views, Database
13 years 10 months ago
Web Data Extraction for Business Intelligence: The Lixto Approach
Knowledge about market developments and competitor activities on the market becomes more and more a critical success factor for enterprises. The World Wide Web provides public do...
Georg Gottlob
IJCAI
2003
13 years 6 months ago
Integrating Information to Bootstrap Information Extraction from Web Sites
In this paper we propose a methodology to learn to extract domain-specific information from large repositories (e.g. the Web) with minimum user intervention. Learning is seeded b...
Fabio Ciravegna, Alexiei Dingli, David Guthrie, Yo...
CAISE
2003
Springer
13 years 10 months ago
Extending an on-line information site with accurate domain-dependent extracts from the World Wide Web
This paper describes a new procedure that has been developed for extending an existing on-line information system about The Voyages of the Beagle with information collected automat...
Enrique Alfonseca, Pilar Rodríguez