Sciweavers

SIGIR
2006
ACM
13 years 10 months ago
Building a test collection for complex document information processing
Research and development of information access technology for scanned paper documents has been hampered by the lack of public test collections of realistic scope and complexity. A...
David D. Lewis, Gady Agam, Shlomo Argamon, Ophir F...
SIGIR
2006
ACM
13 years 10 months ago
Is XML retrieval meaningful to users?: searcher preferences for full documents vs. elements
The aim of this study is to investigate whether element retrieval (as opposed to full-text retrieval) is meaningful and useful for searchers when carrying out information-seeking ...
Birger Larsen, Anastasios Tombros, Saadia Malik
SIGIR
2006
ACM
13 years 10 months ago
User modeling for full-text federated search in peer-to-peer networks
User modeling for information retrieval has mostly been studied to improve the effectiveness of information access in centralized repositories. In this paper we explore user model...
Jie Lu, James P. Callan
SIGIR
2006
ACM
13 years 10 months ago
Learning to advertise
Content-targeted advertising, the task of automatically associating ads to a Web page, constitutes a key Web monetization strategy nowadays. Further, it introduces new challenging...
Anísio Lacerda, Marco Cristo, Marcos André...
SIGIR
2006
ACM
13 years 10 months ago
Respect my authority!: HITS without hyperlinks, utilizing cluster-based language models
We present an approach to improving the precision of an initial document ranking wherein we utilize cluster information within a graph-based framework. The main idea is to perform...
Oren Kurland, Lillian Lee
SIGIR
2006
ACM
13 years 10 months ago
Text clustering with extended user feedback
Text clustering is most commonly treated as a fully automated task without user feedback. However, a variety of researchers have explored mixed-initiative clustering methods which...
Yifen Huang, Tom M. Mitchell
SIGIR
2006
ACM
13 years 10 months ago
Information retrieval with commonsense knowledge
This paper employs ConceptNet, which covers a rich set of commonsense concepts, to retrieve images with text descriptions by focusing on spatial relationships. Evaluation on test ...
Ming-Hung Hsu, Hsin-Hsi Chen
SIGIR
2006
ACM
13 years 10 months ago
Identifying comparative sentences in text documents
This paper studies the problem of identifying comparative sentences in text documents. The problem is related to but quite different from sentiment/opinion sentence identification...
Nitin Jindal, Bing Liu
SIGIR
2006
ACM
13 years 10 months ago
A framework to predict the quality of answers with non-textual features
New types of document collections are being developed by various web services. The service providers keep track of non-textual features such as click counts. In this paper, we pre...
Jiwoon Jeon, W. Bruce Croft, Joon Ho Lee, Soyeon P...
SIGIR
2006
ACM
13 years 10 months ago
Finding near-duplicate web pages: a large-scale evaluation of algorithms
Broder et al.’s [3] shingling algorithm and Charikar’s [4] random projection based approach are considered “state-of-the-art” algorithms for finding near-duplicate web pag...
Monika Rauch Henzinger