Sciweavers

» Hardness of String Similarity Search and Other Indexing Prob...
SIGIR
2009
ACM
15 years 4 months ago
Brute force and indexed approaches to pairwise document similarity comparisons with MapReduce
This paper explores the problem of computing pairwise similarity on document collections, focusing on the application of “more like this” queries in the life sciences domain. ...
Jimmy J. Lin
DEXAW
2008
IEEE
15 years 4 months ago
Self-Similarity Metric for Index Pruning in Conceptual Vector Space Models
One of the critical issues in search engines is the size of search indexes: as the number of documents handled by an engine increases, the search must preserve its efficiency,...
Dario Bonino, Fulvio Corno
VLDB
2003
ACM
15 years 9 months ago
RE-tree: an efficient index structure for regular expressions
Due to their expressive power, Regular Expressions (REs) are quickly becoming an integral part of language specifications for several important application scenarios. Many of thes...
Chee Yong Chan, Minos N. Garofalakis, Rajeev Rasto...
ICDAR
2011
IEEE
13 years 9 months ago
Word Retrieval in Historical Document Using Character-Primitives
Word searching and indexing in historical document collections is a challenging problem because characters in these documents are often touching or broken due to degradation/agei...
Partha Pratim Roy, Jean-Yves Ramel, Nicolas Ragot
BIBM
2009
IEEE
15 years 4 months ago
Application of Kernel Functions for Accurate Similarity Search in Large Chemical Databases
Background: Similarity search in chemical structure databases is an important problem with many applications in chemical genomics, drug design, and efficient chemical probe screeni...
Xiaohong Wang, Jun Huan, Aaron M. Smalter, Gerald ...