Web usage mining, the analysis of user navigation paths through web sites, is a common technique for evaluating site designs or adaptive hypermedia techniques. However, often it i...
Most human activities occur around where the user is physically located. Knowing the geographical serving area of web resources, therefore, is very important for many web applicat...
Qi Zhang, Xing Xie, Lee Wang, Lihua Yue, Wei-Ying ...
We propose a novel extraction approach that exploits content redundancy on the web to extract structured data from template-based web sites. We start by populating a seed database...
Pankaj Gulhane, Rajeev Rastogi, Srinivasan H. Seng...
This paper presents a novel method for extracting information from collections of Web pages across different sites. Our method uses a standard wrapper induction algorithm and explo...
We consider the problem of template-independent news extraction. The state-of-the-art news extraction method is based on template-level wrapper induction, which has two serious li...
Junfeng Wang, Xiaofei He, Can Wang, Jian Pei, Jiaj...