Many web links mislead human surfers and automated crawlers because they point to changed content, out-of-date information, or invalid URLs. It is a particular problem for large, ...
Link analysis has been used for years to improve web search retrieval results. PageRank and HITS are the two best-known algorithms and are widely used by researchers. The form...
We describe a joint probabilistic model of the contents and inter-connectivity of document collections such as sets of web pages or research paper archives. The model is...
Today’s search engines retrieve tens of thousands of web pages in response to fairly simple query articulations. These pages are retrieved on the basis of the query terms occurr...
Richong Zhang, Michael A. Shepherd, Jack Duffy, Ca...
Web spam can significantly degrade the quality of search engine results. Early web spamming techniques mainly manipulated page content. Since linkage information is widely used in we...