Sciweavers

» Cleaning Web Pages for Effective Web Content Mining
HT
2009
ACM
15 years 4 months ago
Interpreting the layout of web pages
Web pages such as news and shopping sites often use modular layouts. When used effectively this practice allows authors to present clearly large amounts of information in a single...
Luis Francisco-Revilla, Jeff Crow
VLDB
2004
ACM
15 years 2 months ago
WIC: A General-Purpose Algorithm for Monitoring Web Information Sources
The Web is becoming a universal information dissemination medium, due to a number of factors including its support for content dynamicity. A growing number of Web information prov...
Sandeep Pandey, Kedar Dhamdhere, Christopher Olsto...
WWW
2008
ACM
15 years 10 months ago
Validating the use and role of visual elements of web pages in navigation with an eye-tracking study
This paper presents an eye-tracking study that examines how people use the visual elements of Web pages to complete certain tasks. Whilst these elements are available to play thei...
Yeliz Yesilada, Caroline Jay, Robert Stevens, Simo...
WWW
2006
ACM
15 years 3 months ago
Do not crawl in the DUST: different URLs with similar text
We consider the problem of dust: Different URLs with Similar Text. Such duplicate URLs are prevalent in web sites, as web server software often uses aliases and redirections, and...
Uri Schonfeld, Ziv Bar-Yossef, Idit Keidar
CHI
2008
ACM
15 years 10 months ago
Large scale analysis of web revisitation patterns
Our work examines Web revisitation patterns. Everybody revisits Web pages, but their reasons for doing so can differ depending on the particular Web page, their topic of interest,...
Eytan Adar, Jaime Teevan, Susan T. Dumais