Co-citation (number of nodes linking to both of a given pair of nodes) is often used heuristically to judge similarity between nodes in a complex network. We investigate the relat...
In the original PageRank algorithm for improving the ranking of search-query results, a single PageRank vector is computed, using the link structure of the Web, to capture the rel...
Many machine learning technologies such as support vector machines, boosting, and neural networks have been applied to the ranking problem in information retrieval. However, since...
Tao Qin, Xu-Dong Zhang, Ming-Feng Tsai, De-Sheng W...
Combating Web spam is one of the greatest challenges for Web search engines. State-of-the-art anti-spam techniques focus mainly on detecting varieties of spam strategies, such as ...
Chao Wei, Yiqun Liu, Min Zhang, Shaoping Ma, Liyun...
We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...