Sciweavers

Search: Multimedia Retrieval Algorithmics
WWW
2007
ACM
Detecting near-duplicates for web crawling
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma
WWW
2007
ACM
Query topic detection for reformulation
In this paper, we show that most multiple term queries include more than one topic and users usually reformulate their queries by topics instead of terms. In order to provide empi...
Xuefeng He, Jun Yan, Jinwen Ma, Ning Liu, Zheng Ch...
WWW
2006
ACM
Improved annotation of the blogosphere via autotagging and hierarchical clustering
Tags have recently become popular as a means of annotating and organizing Web pages and blog entries. Advocates of tagging argue that the use of tags produces a 'folksonomy'...
Christopher H. Brooks, Nancy Montanez
WWW
2005
ACM
Partitioning of Web graphs by community topology
We introduce a stricter Web community definition to overcome boundary ambiguity of a Web community defined by Flake, Lawrence and Giles [2], and consider the problem of finding co...
Hidehiko Ino, Mineichi Kudo, Atsuyoshi Nakamura
WWW
2004
ACM
The webgraph framework I: compression techniques
Studying Web graphs is often difficult due to their large size. Recently, several proposals have been published about various techniques that allow a Web graph to be stored in memory ...
Paolo Boldi, Sebastiano Vigna