Web spam is a widely recognized threat to the quality and security of the Web. Web spam pages pollute search engine indexes, burden Web crawlers and Web mining services, and expos...
We propose a General Markov Framework for computing page importance. Under the framework, a Markov Skeleton Process is used to model the random walk conducted by the web surfer on...
Bin Gao, Tie-Yan Liu, Zhiming Ma, Taifeng Wang, Ha...
In this paper we present static and dynamic studies of duplicate and near-duplicate documents in the Web. The static and dynamic studies involve the analysis of similar c...
Web link analysis has been proved to provide significant enhancement to the precision of web search in practice. Among existing approaches, Kleinberg’s HITS and Google’s PageR...
Zheng Chen, Li Tao, Jidong Wang, Liu Wenyin, Wei-Y...
Is it possible to use sense inventories to improve Web search results diversity for one-word queries? To answer this question, we focus on two broad-coverage lexical resources of ...