Sciweavers

27 search results - page 5 / 6
» Extraction of Flat and Nested Data Records from Web Pages
WWW
2010
ACM
13 years 5 months ago
Exploiting content redundancy for web information extraction
We propose a novel extraction approach that exploits content redundancy on the web to extract structured data from template-based web sites. We start by populating a seed database...
Pankaj Gulhane, Rajeev Rastogi, Srinivasan H. Seng...
WIDM
2005
ACM
13 years 11 months ago
Web path recommendations based on page ranking and Markov models
Markov models have been widely used for modelling users' navigational behaviour in the Web graph, using the transitional probabilities between web pages, as recorded in the w...
Magdalini Eirinaki, Michalis Vazirgiannis, Dimitri...
SIGKDD
2010
13 years 10 days ago
Unexpected results in automatic list extraction on the web
The discovery and extraction of general lists on the Web continues to be an important problem facing the Web mining community. There have been numerous studies that claim to autom...
Tim Weninger, Fabio Fumarola, Rick Barber, Jiawei ...
WWW
2001
ACM
14 years 6 months ago
IEPAD: information extraction based on pattern discovery
The research in information extraction (IE) regards the generation of wrappers that can extract particular information from semistructured Web documents. Similar to compiler gener...
Chia-Hui Chang, Shao-Chen Lui
CACM
1998
13 years 5 months ago
Viewing WISs as Database Applications
abstraction for modeling these problems is to view the Web as a collection of (usually small and heterogeneous) databases, and to view programs that extract and process Web data au...
Gustavo O. Arocena, Alberto O. Mendelzon