This paper explores the problem of computing pairwise similarity on document collections, focusing on the application of “more like this” queries in the life sciences domain. ...
One of the critical issues in search engines is the size of search indexes: as the number of documents handled by an engine increases, the search must preserve its efficiency,...
Due to their expressive power, Regular Expressions (REs) are quickly becoming an integral part of language specifications for several important application scenarios. Many of thes...
Chee Yong Chan, Minos N. Garofalakis, Rajeev Rasto...
Word searching and indexing in historical document collections is a challenging problem because characters in these documents are often touching or broken due to degradation/agei...
Background: Similarity search in chemical structure databases is an important problem with many applications in chemical genomics, drug design, and efficient chemical probe screeni...
Xiaohong Wang, Jun Huan, Aaron M. Smalter, Gerald ...