Sciweavers

226 search results - page 33 / 46
» Web Page Clustering Using Heuristic Search in the Web Graph
KDD
2007
ACM
15 years 10 months ago
Mining templates from search result records of search engines
Metasearch engine, Comparison-shopping and Deep Web crawling applications need to extract search result records enwrapped in result pages returned from search engines in response ...
Hongkun Zhao, Weiyi Meng, Clement T. Yu
CIKM
2011
Springer
13 years 9 months ago
Probabilistic near-duplicate detection using simhash
This paper offers a novel look at using a dimensionality-reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
Sadhan Sood, Dmitri Loguinov
CIKM
2009
Springer
15 years 4 months ago
A general Markov framework for page importance computation
We propose a General Markov Framework for computing page importance. Under the framework, a Markov Skeleton Process is used to model the random walk conducted by the web surfer on...
Bin Gao, Tie-Yan Liu, Zhiming Ma, Taifeng Wang, Ha...
CIKM
2008
Springer
14 years 11 months ago
A random walk on the red carpet: rating movies with user reviews and PageRank
Although PageRank has been designed to estimate the popularity of Web pages, it is a general algorithm that can be applied to the analysis of other graphs other than one of hypert...
Derry Tanti Wijaya, Stéphane Bressan
WSDM
2009
ACM
15 years 4 months ago
Less is more: sampling the neighborhood graph makes SALSA better and faster
In this paper, we attempt to improve the effectiveness and the efficiency of query-dependent link-based ranking algorithms such as HITS, MAX and SALSA. All these ranking algorith...
Marc Najork, Sreenivas Gollapudi, Rina Panigrahy