Sciweavers

» Sampling, information extraction and summarisation of Hidden...
TOIS
2008
Classification-aware hidden-web text database selection
Many valuable text databases on the web have non-crawlable contents that are "hidden" behind search interfaces. Metasearchers are helpful tools for searching over multip...
Panagiotis G. Ipeirotis, Luis Gravano
WEBDB
1998
Springer
Extracting Patterns and Relations from the World Wide Web
The World Wide Web is a vast resource for information. At the same time it is extremely distributed. A particular type of data such as restaurant lists may be scattered across thous...
Sergey Brin
ICDM
2006
IEEE
Unsupervised Learning of Tree Alignment Models for Information Extraction
We propose an algorithm for extracting fields from HTML search results. The output of the algorithm is a database table, a data structure that better lends itself to high-level...
Philip Zigoris, Damian Eads, Yi Zhang
WEBDB
2010
Springer
Redundancy-Driven Web Data Extraction and Integration
A large number of web sites publish pages containing structured information about recognizable concepts, but these data are only partially used by current applications. Although s...
Paolo Papotti, Valter Crescenzi, Paolo Merialdo, M...
CIKM
2009
Springer
An empirical study on using hidden Markov model for search interface segmentation
This paper describes a hidden Markov model (HMM) based approach to perform search interface segmentation. Automatic processing of an interface is a must to access the invisible co...
Ritu Khare, Yuan An