Sciweavers

SIGIR
2008
ACM
13 years 4 months ago
Semi-supervised spam filtering: does it work?
The results of the 2006 ECML/PKDD Discovery Challenge suggest that semi-supervised learning methods work well for spam filtering when the source of available labeled examples diff...
Mona Mojdeh, Gordon V. Cormack
SIGIR
2008
ACM
Selecting good expansion terms for pseudo-relevance feedback
Pseudo-relevance feedback assumes that the most frequent terms in the pseudo-feedback documents are useful for retrieval. In this study, we re-examine this assumption and show tha...
Guihong Cao, Jian-Yun Nie, Jianfeng Gao, Stephen R...
SIGIR
2008
ACM
XML-aided phrase indexing for hypertext documents
We combine techniques of XML Mining and Text Mining for the benefit of Information Retrieval. By manipulating the word sequence according to the XML structure of the marked-up tex...
Miro Lehtonen, Antoine Doucet
SIGIR
2008
ACM
Relevance assessment: are judges exchangeable and does it matter
We investigate to what extent people making relevance judgements for a reusable IR test collection are exchangeable. We consider three classes of judge: "gold standard" ...
Peter Bailey, Nick Craswell, Ian Soboroff, Paul Th...
SIGIR
2008
ACM
Query-drift prevention for robust query expansion
Pseudo-feedback-based automatic query expansion yields effective retrieval performance on average, but results in performance inferior to that of using the original query for many...
Liron Zighelnic, Oren Kurland
SIGIR
2008
ACM
Score standardization for inter-collection comparison of retrieval systems
The goal of system evaluation in information retrieval has always been to determine which of a set of systems is superior on a given collection. The tool used to determine system ...
William Webber, Alistair Moffat, Justin Zobel
SIGIR
2008
ACM
A user browsing model to predict search engine click data from past observations
Search engine click logs provide an invaluable source of relevance information, but this information is biased because we do not know which documents from the result list the users have...
Georges Dupret, Benjamin Piwowarski
SIGIR
2008
ACM
Detecting synonyms in social tagging systems to improve content retrieval
Collaborative tagging used in online social content systems is naturally characterized by many synonyms, causing low-precision retrieval. We propose a mechanism based on user pref...
Maarten Clements, Arjen P. de Vries, Marcel J. T. ...
SIGIR
2008
ACM
A study of query length
We analyse query length, and fit power-law and Poisson distributions to four different query sets. We provide a practical model for query length, based on the truncation of a Pois...
Avi Arampatzis, Jaap Kamps
SIGIR
2008
ACM
Real-time automatic tag recommendation
Yang Song, Ziming Zhuang, Huajing Li, Qiankun Zhao...