Scalability to a large number of processes is one of the weaknesses of current MPI implementations. Standard implementations are able to scale to hundreds of nodes, but not beyond th...
Felix Freitag, Jordi Caubet, Montse Farreras, Toni...
Scalable busy-wait synchronization algorithms are essential for achieving good parallel program performance on large-scale multiprocessors. Such algorithms include mutual exclusio...
Robert W. Wisniewski, Leonidas I. Kontothanassis, ...
Failures of any type are common in current datacenters, partly due to the higher scales of the data stored. As data scales up, its availability becomes more complex, while differe...
Nicolas Bonvin, Thanasis G. Papaioannou, Karl Aber...
GPU-based heterogeneous clusters continue to draw attention from vendors and HPC users due to their high energy efficiency and much improved single-node computational performance...
We consider approximate join processing over data streams when memory limitations cause incoming tuples to overflow the available space, precluding exact processing. Selective evi...