Sciweavers

SIGIR
2009
ACM
13 years 11 months ago
Positional language models for information retrieval
Although many variants of language models have been proposed for information retrieval, there are two related retrieval heuristics remaining “external” to the language modelin...
Yuanhua Lv, ChengXiang Zhai
SIGIR
2009
ACM
13 years 11 months ago
Brute force and indexed approaches to pairwise document similarity comparisons with MapReduce
This paper explores the problem of computing pairwise similarity on document collections, focusing on the application of “more like this” queries in the life sciences domain. ...
Jimmy J. Lin
SIGIR
2009
ACM
13 years 11 months ago
CompositeMap: a novel framework for music similarity measure
With the continuing advances in data storage and communication technology, there has been an explosive growth of music information from different application domains. As an effe...
Bingjun Zhang, Jialie Shen, Qiaoliang Xiang, Ye Wa...
SIGIR
2009
ACM
13 years 11 months ago
Measuring the descriptiveness of web comments
This paper investigates whether Web comments are of a descriptive nature, that is, whether the combined text of a set of comments is similar in topic to the commented object. If so,...
Martin Potthast
SIGIR
2009
ACM
13 years 11 months ago
A ranking approach to keyphrase extraction
This paper addresses the issue of automatically extracting keyphrases from documents. Previously, this problem was formalized as classification and learning methods for classific...
Xin Jiang, Yunhua Hu, Hang Li
SIGIR
2009
ACM
13 years 11 months ago
An improved Markov random field model for supporting verbose queries
Recent work in supervised learning of term-based retrieval models has shown significantly improved accuracy can often be achieved via better model estimation [2, 10, 11, 17]. In ...
Matthew Lease
SIGIR
2009
ACM
13 years 11 months ago
A statistical comparison of tag and query logs
Mark James Carman, Mark Baillie, Robert Gwadera, F...
SIGIR
2009
ACM
13 years 11 months ago
SUSHI: scoring scaled samples for server selection
Modern techniques for distributed information retrieval use a set of documents sampled from each server, but these samples have been underutilised in server selection. We describe...
Paul Thomas, Milad Shokouhi
SIGIR
2009
ACM
13 years 11 months ago
On the relative age of spam and ham training samples for email filtering
Email spam filters are commonly trained on a sample of spam and ham (non-spam) messages. We investigate the effect on filter performance of using samples of spam and ham messag...
Gordon V. Cormack, Jose-Marcio Martins da Cruz
SIGIR
2009
ACM
13 years 11 months ago
Segment-level display time as implicit feedback: a comparison to eye tracking
We examine two basic sources for implicit relevance feedback on the segment level for search personalization: eye tracking and display time. A controlled study has been conducted ...
Georg Buscher, Ludger van Elst, Andreas Dengel