Web spam can significantly deteriorate the quality of search engines. Early web spamming techniques mainly manipulate page content. Since linkage information is widely used in we...
Query-independent features (also called document priors), such as the number of incoming links to a document, its Page-Rank, or the type of its associated URL, have been successfu...
Hierarchical models are commonly used to organize a Website's content. A Website's content structure can be represented by a topic hierarchy, a directed tree rooted at a...
To browse the Web, blind people have to use screen readers, which process pages sequentially, making browsing timeconsuming. We present a prototype system, CSurf, which provides a...
Jalal Mahmud, Yevgen Borodin, Dipanjan Das, I. V. ...
A major challenge in developing models for hypertext retrieval is to effectively combine content information with the link structure available in hypertext collections. Although s...