We initiate the study of sublinear-time algorithms in the external memory model [Vit01]. In this model, the data is stored in blocks of a certain size B, and the algorithm is charged a unit cost for each block access. This model is well-studied, since it reflects the computational issues occurring when the (massive) input is stored on a disk. Since each block access operates on B data elements in parallel, many problems have external memory algorithms whose number of block accesses is only a small fraction (e.g. 1/B) of their main memory complexity. However, to the best of our knowledge, no such reduction in complexity is known for any sublinear-time algorithm. One plausible explanation is that the vast majority of sublinear-time algorithms use random sampling and thus exhibit no locality of reference. This state of affairs is quite unfortunate, since both sublinear-time algorithms and the external memory model are important approaches to dealing with massive data sets, and ideally th...
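As a rough illustration of the cost model described in the abstract, the sketch below counts block accesses under the assumption that element i lives in block i // B and that each distinct block touched costs one unit. The helper name `blocks_touched` and the values of n, B, and k are hypothetical choices made for this illustration; none of them come from the paper.

```python
import math
import random


def blocks_touched(indices, block_size):
    """Count the distinct blocks that cover the given element indices.

    Assumes element i is stored in block i // block_size and that every
    distinct block read costs one unit, as in the external memory model.
    """
    return len({i // block_size for i in indices})


n = 1_000_000  # number of data elements (illustrative)
B = 100        # block size (illustrative)
k = 500        # number of elements the algorithm inspects (illustrative)

# A full sequential scan reads every block exactly once: ceil(n / B) accesses.
scan_cost = blocks_touched(range(n), B)

# Reading k contiguous elements touches only about k / B blocks.
contiguous_cost = blocks_touched(range(k), B)

# Reading k uniformly random elements has no locality of reference, so
# almost every sample is likely to land in a different block.
rng = random.Random(0)
samples = [rng.randrange(n) for _ in range(k)]
sample_cost = blocks_touched(samples, B)

print("full scan:", scan_cost)
print("k contiguous elements:", contiguous_cost)
print("k random samples:", sample_cost)
```

Under these assumptions the contiguous read benefits from the factor-B saving, while the random sample pays close to one block access per element, which is the lack of locality the abstract points to.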

Added | 14 Feb 2011
Updated | 14 Feb 2011
Type | Journal
Year | 2010
Where | PROPERTYTESTING
Authors | Alexandr Andoni, Piotr Indyk, Krzysztof Onak, Ronitt Rubinfeld

