Sciweavers

543 search results for "Exploiting content redundancy for web information extraction"
WWW
2010
ACM
13 years 4 months ago
Exploiting content redundancy for web information extraction
We propose a novel extraction approach that exploits content redundancy on the web to extract structured data from template-based web sites. We start by populating a seed database...
Pankaj Gulhane, Rajeev Rastogi, Srinivasan H. Seng...
LREC
2010
13 years 6 months ago
Entity Mention Detection using a Combination of Redundancy-Driven Classifiers
We present an experimental framework for Entity Mention Detection in which two different classifiers are combined to exploit Data Redundancy attained through the annotation of a l...
Silvana Marianela Bernaola Biggio, Manuela Speranz...
WEBDB
2010
Springer
13 years 9 months ago
Redundancy-Driven Web Data Extraction and Integration
A large number of web sites publish pages containing structured information about recognizable concepts, but these data are only partially used by current applications. Although s...
Paolo Papotti, Valter Crescenzi, Paolo Merialdo, M...
ESWS
2006
Springer
13 years 8 months ago
Extracting Instances of Relations from Web Documents Using Redundancy
In this document we describe our approach to a specific subtask of ontology population, the extraction of instances of relations. We present a generic approach with which...
Viktor de Boer, Maarten van Someren, Bob J. Wielin...
ITCC
2005
IEEE
13 years 10 months ago
Elimination of Redundant Information for Web Data Mining
These days, billions of Web pages are created with HTML or other markup languages. They only have a few uniform structures and contain various authoring styles compared to traditi...
Shakirah Mohd Taib, Soon-ja Yeom, Byeong Ho Kang