WWW
2010
ACM
The paths more taken: matching DOM trees to search logs for accurate webpage clustering
An unsupervised clustering of the webpages on a website is a primary requirement for most wrapper induction and automated data extraction methods. Since page content can vary dras...
Deepayan Chakrabarti, Rupesh R. Mehta