In this paper, we present an efficient algorithm, called CASS-II, for task clustering without task duplication. Unlike the DSC algorithm, which is empirically the best known algor...
This paper argues for an alternative way of designing coordination models for parallel and distributed environments based on a complete symmetry between and decoupling of producers...
Numerical applications frequently contain nested loop structures that process large arrays of data. The execution of these loop structures often produces memory reference pattern...
Yoji Yamada, John Gyllenhaal, Grant Haab, Wen-mei ...
We present a linear programming-based method for finding "gadgets", i.e., combinatorial structures reducing constraints of one optimization problem to constraints of another. A...
Luca Trevisan, Gregory B. Sorkin, Madhu Sudan, Dav...
Today’s large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data struc...
Timo Heister, Martin Kronbichler, Wolfgang Bangert...