In large content-based image database applications, efficient information retrieval depends heavily on good indexing structures of the extracted features. While indexing techniques f...
A framework is presented for discovering partial duplicates in large collections of scanned books with optical character recognition (OCR) errors. Each book in the collection is r...
This paper reports on work done for the Genomics Track at TREC 2004 by ConverSpeech LLC in conjunction with scientists at the Saccharomyces Genome Database (SGD), the model organi...
Colleen E. Crangle, Alex Zbyslaw, J. Michael Cherr...
We propose an automatic method for measuring content-based music similarity, enhancing the current generation of music search engines and recommender systems. Many previo...
We consider the problem of finding officially unrecognized side effects of drugs. By submitting queries to the Web involving a given drug name, it is possible to retrieve pages co...
Carlo Curino, Yuanyuan Jia, Bruce Lambert, Patrici...