Sciweavers

158 search results - page 27 / 32
SIGIR
2008
ACM
14 years 10 months ago
SpotSigs: robust and efficient near duplicate detection in large web collections
Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...
Martin Theobald, Jonathan Siddharth, Andreas Paepc...
SIGIR
2008
ACM
14 years 10 months ago
A unified and discriminative model for query refinement
This paper addresses the issue of query refinement, which involves reformulating ill-formed search queries in order to enhance relevance of search results. Query refinement typica...
Jiafeng Guo, Gu Xu, Hang Li, Xueqi Cheng
SIGIR
2008
ACM
14 years 10 months ago
Asymmetric distance estimation with sketches for similarity search in high-dimensional spaces
Efficient similarity search in high-dimensional spaces is important to content-based retrieval systems. Recent studies have shown that sketches can effectively approximate L1 dist...
Wei Dong, Moses Charikar, Kai Li
SIGIR
2008
ACM
14 years 10 months ago
A few examples go a long way: constructing query models from elaborate query formulations
We address a specific enterprise document search scenario, where the information need is expressed in an elaborate manner. In our scenario, information needs are expressed using a...
Krisztian Balog, Wouter Weerkamp, Maarten de Rijke
SIGIR
2008
ACM
14 years 10 months ago
A simple and efficient sampling method for estimating AP and NDCG
We consider the problem of large scale retrieval evaluation. Recently two methods based on random sampling were proposed as a solution to the extensive effort required to judge te...
Emine Yilmaz, Evangelos Kanoulas, Javed A. Aslam