Sciweavers

48 search results - page 7 / 10
» A Comparison of Techniques for Sampling Web Pages
ICDM
2008
IEEE
186 views, Data Mining
15 years 5 months ago
xCrawl: A High-Recall Crawling Method for Web Mining
Web Mining Systems exploit the redundancy of data published on the Web to automatically extract information from existing web documents. The first step in the Information Extract...
Kostyantyn M. Shchekotykhin, Dietmar Jannach, Gerh...
VLDB
2011
ACM
251 views, Database
14 years 6 months ago
Harvesting relational tables from lists on the web
A large number of web pages contain data structured in the form of “lists”. Many such lists can be further split into multi-column tables, which can then be used in more seman...
Hazem Elmeleegy, Jayant Madhavan, Alon Y. Halevy
CIKM
2009
Springer
15 years 5 months ago
Compact full-text indexing of versioned document collections
We study the problem of creating highly compressed fulltext index structures for versioned document collections, that is, collections that contain multiple versions of each docume...
Jinru He, Hao Yan, Torsten Suel
IJSI
2008
115 views
14 years 11 months ago
Towards Knowledge Acquisition from Semi-Structured Content
A rich family of generic Information Extraction (IE) techniques have been developed by researchers nowadays. This paper proposes WebKER, a system for automatically extract...
Xi Bai, Jigui Sun, Haiyan Che, Lian Shi
COMCOM
2007
84 views
14 years 11 months ago
A user-focused evaluation of web prefetching algorithms
Web prefetching mechanisms have been proposed to benefit web users by hiding the download latencies. Nevertheless, to the knowledge of the authors, there is no attempt to compare...
Josep Domènech, Ana Pont, Julio Sahuquillo,...