Sciweavers

152 search results - page 1 / 31
» Redundancy-Driven Web Data Extraction and Integration
WEBDB
2010
Springer
13 years 9 months ago
Redundancy-Driven Web Data Extraction and Integration
A large number of web sites publish pages containing structured information about recognizable concepts, but these data are only partially used by current applications. Although s...
Paolo Papotti, Valter Crescenzi, Paolo Merialdo, M...
LREC
2010
13 years 6 months ago
Entity Mention Detection using a Combination of Redundancy-Driven Classifiers
We present an experimental framework for Entity Mention Detection in which two different classifiers are combined to exploit Data Redundancy attained through the annotation of a l...
Silvana Marianela Bernaola Biggio, Manuela Speranz...
WWW
2011
ACM
12 years 11 months ago
HyLiEn: a hybrid approach to general list extraction on the web
We consider the problem of automatically extracting general lists from the web. Existing approaches are mostly dependent upon either the underlying HTML markup or the visual struc...
Fabio Fumarola, Tim Weninger, Rick Barber, Donato ...
PVLDB
2010
13 years 2 months ago
ObjectRunner: Lightweight, Targeted Extraction and Querying of Structured Web Data
We present in this paper ObjectRunner, a system for extracting, integrating and querying structured data from the Web. Our system harvests real-world items from template-based HTM...
Talel Abdessalem, Bogdan Cautis, Nora Derouiche
JMLR
2008
13 years 4 months ago
Dynamic Hierarchical Markov Random Fields for Integrated Web Data Extraction
Existing template-independent web data extraction approaches adopt highly ineffective decoupled strategies--attempting to do data record detection and attribute labeling in two se...
Jun Zhu, Zaiqing Nie, Bo Zhang, Ji-Rong Wen