Sciweavers

» Space-Efficient Algorithms for Document Retrieval
SIGIR
2010
ACM
15 years 9 months ago
Adaptive near-duplicate detection via similarity learning
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
SIGIR
2002
ACM
15 years 4 months ago
Generic summarization and keyphrase extraction using mutual reinforcement principle and sentence clustering
A novel method for simultaneous keyphrase extraction and generic text summarization is proposed by modeling text documents as weighted undirected and weighted bipartite graphs. Sp...
Hongyuan Zha
SPIRE
2010
Springer
15 years 3 months ago
Dual-Sorted Inverted Lists
Several IR tasks rely, to achieve high efficiency, on a single pervasive data structure called the inverted index. This is a mapping from the terms in a text collection to the docu...
Gonzalo Navarro, Simon J. Puglisi
TREC
2008
15 years 6 months ago
Distributed Multisearch and Resource Selection for the TREC Million Query Track
A distributed information retrieval system with resource-selection and result-set merging capability was used to search subsets of the GOV2 document corpus for the 2008 TREC Million...
Christopher T. Fallen, Gregory B. Newby, Kylie McC...
HPDC
2003
IEEE
15 years 10 months ago
PlanetP: Using Gossiping to Build Content Addressable Peer-to-Peer Information Sharing Communities
We present PlanetP, a peer-to-peer (P2P) content search and retrieval infrastructure targeting communities wishing to share large sets of text documents. P2P computing is...
Francisco Matias Cuenca-Acuna, Christopher Peery, ...