Sciweavers

SIGIR
2004
ACM
Parameterized generation of labeled datasets for text categorization based on a hierarchical directory
Although text categorization is a burgeoning area of IR research, readily available test collections in this field are surprisingly scarce. We describe a methodology and system (...
Dmitry Davidov, Evgeniy Gabrilovich, Shaul Markovi...
SIGIR
2004
ACM
Answer models for question answering passage retrieval
Answer patterns have been shown to improve the performance of open-domain factoid QA systems. Their use, however, requires either constructing the patterns manually or developing ...
Andrés Corrada-Emmanuel, W. Bruce Croft
SIGIR
2004
ACM
Constructing a text corpus for inexact duplicate detection
As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. The goal of this work i...
Jack G. Conrad, Cindy P. Schriber
SIGIR
2004
ACM
Evaluating content-based filters for image and video retrieval
This paper investigates the level of metadata accuracy required for image filters to be valuable to users. Access to large digital image and video collections is hampered by ambig...
Michael G. Christel, Neema Moraveji, Chang Huang
SIGIR
2004
ACM
Knowing Where to Search: Personalized Search Strategies for Peers in P2P Networks
Optimizing and focusing search and results ranking in P2P networks becomes more and more important with the increasing size of these networks. Even though a few approaches have al...
Paul-Alexandru Chirita, Wolfgang Nejdl, Oana Scurt...
SIGIR
2004
ACM
Translating unknown queries with web corpora for cross-language information retrieval
It is crucial for cross-language information retrieval (CLIR) systems to deal with the translation of unknown queries, because real queries might be short. The purpose of this...
Pu-Jen Cheng, Jei-Wen Teng, Ruey-Cheng Chen, Jenq-...
SIGIR
2004
ACM
Subwebs for specialized search
We describe a method to define and use subwebs, user-defined neighborhoods of the Internet. Subwebs help improve search performance by inducing a topic-specific page relevance ...
Raman Chandrasekar, Harr Chen, Simon Corston-Olive...
SIGIR
2004
ACM
GaP: a factor model for discrete data
We present a probabilistic model for a document corpus that combines many of the desirable features of previous models. The model is called “GaP” for Gamma-Poisson, the distri...
John F. Canny