We present a document routing and index partitioning scheme for scalable similarity-based search of documents in a large corpus. We consider the case when similarity-based search ...
Failures of any type are common in current datacenters, partly due to the higher scales of the data stored. As data scales up, its availability becomes more complex, while differe...
Nicolas Bonvin, Thanasis G. Papaioannou, Karl Aber...
Today’s one-pass analytics applications tend to be data-intensive in nature and require the ability to process high volumes of data efficiently. MapReduce is a popular programm...
Boduo Li, Edward Mazur, Yanlei Diao, Andrew McGreg...
Because of the high volume and unpredictable arrival rate, stream processing systems may not always be able to keep up with the input data streams-- resulting in buffer overflow a...
An important problem for many location-based applications is the continuous evaluation of proximity relations among moving objects. These relations express whether a given set of ...