The number of documents indexed by a search engine is referred to as the size of the search engine. Information about the size of each underlying search engine is ess...
Background: Interpretation of simple microarray experiments is usually based on the fold-change of gene expression between a reference and a "treated" sample where the t...
Julie L. Morrison, Rainer Breitling, Desmond J. Hi...
A new approach has been developed for acquiring bilingual web pages from the result pages of search engines; it is composed of two challenging tasks. The first task is to detec...
We consider the coverage testing problem, where we are given a document and a corpus with a limited query interface and asked to determine whether the corpus contains a near-duplicate of th...
Ali Dasdan, Paolo D'Alberto, Santanu Kolay, Chris ...
We address the problem of measuring global quality metrics of search engines, such as corpus size, index freshness, and density of duplicates in the corpus. The recently proposed est...