In this paper we present the design, implementation and evaluation of a runtime system based on collective I/O techniques for irregular applications. We present two models, namely...
This paper presents a novel technique to perform global optimization of communication and preprocessing calls in the presence of array accesses with arbitrary subscripts. Our sche...
This paper presents an algorithm for fast sorting of large lists using modern GPUs. The method achieves high speed by efficiently utilizing the parallelism of the GPU throughout th...
Sequencer, privilege-based, and communication history algorithms are popular approaches to implement total ordering, where communication history algorithms are most suit...
Li Ou, Xubin He, Christian Engelmann, Stephen L. S...
While uniprocessor garbage collection is relatively well understood, experience with collectors for large multiprocessor servers is limited and it is unknown which techniques best ...
C. Richard Attanasio, David F. Bacon, Anthony Cocc...