Sciweavers

» Extraction of Flat and Nested Data Records from Web Pages
WWW
2007
ACM
14 years 6 months ago
U-REST: an unsupervised record extraction system
In this paper, we describe a system that can extract record structures from web pages with no direct human supervision. Records are commonly occurring HTML-embedded data tuples th...
Yuan Kui Shen, David R. Karger
AUSAI
2003
Springer
13 years 11 months ago
Semi-Automatic Construction of Metadata from a Series of Web Documents
Metadata plays an important role in discovering, collecting, extracting and aggregating Web data. This paper proposes a method of constructing metadata for a specific topic. The m...
Sachio Hirokawa, Eisuke Itoh, Tetsuhiro Miyahara
ADC
2006
Springer
13 years 11 months ago
A two-phase rule generation and optimization approach for wrapper generation
Web information extraction is a fundamental issue for web information management and integrations. A common approach is to use wrappers to extract data from web pages or documents...
Yanan Hao, Yanchun Zhang
PVLDB
2010
13 years 4 months ago
ObjectRunner: Lightweight, Targeted Extraction and Querying of Structured Web Data
We present in this paper ObjectRunner, a system for extracting, integrating and querying structured data from the Web. Our system harvests real-world items from template-based HTM...
Talel Abdessalem, Bogdan Cautis, Nora Derouiche
ICEIS
2009
IEEE
14 years 13 days ago
Semi-supervised Information Extraction from Variable-length Web-page Lists
We propose two methods for constructing automated programs for extraction of information from a class of web pages that are very common and of high practical significance - varia...
Daniel Nikovski, Alan Esenther, Akihiro Baba