Sciweavers

SIGIR
2006
ACM
Measuring similarity of semi-structured documents with context weights
In this work, we study similarity measures for text-centric XML documents based on an extended vector space model, which considers both document content and structure. Experimenta...
Christopher C. Yang, Nan Liu
SIGIR
2006
ACM
Near-duplicate detection by instance-level constrained clustering
For the task of near-duplicated document detection, both traditional fingerprinting techniques used in database community and bag-of-word comparison approaches used in information...
Hui Yang, James P. Callan
SIGIR
2006
ACM
Probabilistic latent query analysis for combining multiple retrieval sources
Combining the output from multiple retrieval sources over the same document collection is of great importance to a number of retrieval tasks such as multimedia retrieval, web retr...
Rong Yan, Alexander G. Hauptmann
SIGIR
2006
ACM
A study of real-time query expansion effectiveness
In this poster, we describe the study of an interface technique that provides a list of suggested additional query terms as a searcher types a search query, in effect offering int...
Ryen W. White, Gary Marchionini
SIGIR
2006
ACM
LDA-based document models for ad-hoc retrieval
Search algorithms incorporating some form of topic model have a long history in information retrieval. For example, cluster-based retrieval has been studied since the 60s and has ...
Xing Wei, W. Bruce Croft
SIGIR
2006
ACM
Unifying user-based and item-based collaborative filtering approaches by similarity fusion
Memory-based methods for collaborative filtering predict new ratings by averaging (weighted) ratings between, respectively, pairs of similar users or items. In practice, a large ...
Jun Wang, Arjen P. de Vries, Marcel J. T. Reinders
SIGIR
2006
ACM
Latent semantic analysis for multiple-type interrelated data objects
Co-occurrence data is quite common in many real applications. Latent Semantic Analysis (LSA) has been successfully used to identify semantic relations in such data. However, LSA c...
Xuanhui Wang, Jian-Tao Sun, Zheng Chen, ChengXiang...
SIGIR
2006
ACM
Combining bidirectional translation and synonymy for cross-language information retrieval
This paper introduces a general framework for the use of translation probabilities in cross-language information retrieval based on the notion that information retrieval fundament...
Jianqiang Wang, Douglas W. Oard
SIGIR
2006
ACM
Why structural hints in queries do not help XML-retrieval
For many years it has been commonly held that a user who adds structural “hints” to a query will improve precision in an element retrieval search. At INEX 2005 we conducted an...
Andrew Trotman, Mounia Lalmas