Sciweavers

146 search results - page 3 / 30
» RoadRunner: Towards Automatic Data Extraction from Large Web...
PVLDB
2010
13 years 3 months ago
Towards The Web of Concepts: Extracting Concepts from Large Datasets
Concepts are sequences of words that represent real or imaginary entities or ideas that users are interested in. As a first step towards building a web of concepts that will form...
Aditya G. Parameswaran, Hector Garcia-Molina, Anan...
HIS
2003
13 years 6 months ago
Data Mining of Web Access Logs From an Academic Web Site
We have used a general purpose data mining tool to determine whether we can find any ‘golden nuggets’ in the web access logs of a large academic web site. Our goal was to use...
Victor Ciesielski, A. Lalani
WWW
2005
ACM
14 years 6 months ago
Web data extraction based on partial tree alignment
This paper studies the problem of extracting data from a Web page that contains several structured data records. The objective is to segment these data records, extract data items...
Yanhong Zhai, Bing Liu
DEBU
2000
13 years 5 months ago
Accurately and Reliably Extracting Data from the Web: A Machine Learning Approach
A critical problem in developing information agents for the Web is accessing data that is formatted for human use. We have developed a set of tools for extracting data from web si...
Craig A. Knoblock, Kristina Lerman, Steven Minton,...
CIKM
2003
Springer
13 years 10 months ago
Extracting unstructured data from template generated web documents
We propose a novel approach that identifies web page templates and extracts the unstructured data. Extracting only the body of the page and eliminating the template increases the ...
Ling Ma, Nazli Goharian, Abdur Chowdhury, Misun Ch...