Sciweavers

WWW
2011
ACM
12 years 10 months ago
Identifying primary content from web pages and its application to web search ranking
Web pages are usually highly structured documents. In some documents, content with different functionality is laid out in blocks, some merely supporting the main discourse. In ot...
Srinivas Vadrevu, Emre Velipasaoglu
SIGIR
2005
ACM
13 years 10 months ago
Title extraction from bodies of HTML documents and its application to web page retrieval
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
Yunhua Hu, Guomao Xin, Ruihua Song, Guoping Hu, Sh...
ASSETS
2008
ACM
13 years 6 months ago
What's new?: making web page updates accessible
Web applications facilitated by technologies such as JavaScript, DHTML, AJAX, and Flash use a considerable amount of dynamic web content that is either inaccessible or unusable by...
Yevgen Borodin, Jeffrey P. Bigham, Rohit Raman, I....
DEXA
2006
Springer
13 years 6 months ago
Cleaning Web Pages for Effective Web Content Mining
Classifying and mining noise-free web pages will improve on accuracy of search results as well as search speed, and may benefit webpage organization applications (e.g., keyword-bas...
Jing Li, Christie I. Ezeife
HT
2010
ACM
13 years 9 months ago
Is this a good title?
Missing web pages, URIs that return the 404 “Page Not Found” error or the HTTP response code 200 but dereference unexpected content, are ubiquitous in today’s browsing exper...
Martin Klein, Jeffery L. Shipman, Michael L. Nelso...