This paper proposes a parallel cycle-accurate microarchitectural simulator which efficiently executes its workload by splitting the simulation process along time-axis into many in...
Abstract. SKaMPI is a benchmark for MPI implementations. Its purpose is the detailed analysis of the runtime of individual MPI operations and comparison of these for di erent imple...
Ralf Reussner, Peter Sanders, Lutz Prechelt, Matth...
DryadLINQ is a system and a set of language extensions that enable a new programming model for large scale distributed computing. It generalizes previous execution environments su...
Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Bud...
Simultaneous Multithreading (SMT) processors achieve high processor throughput at the expense of single-thread performance. This paper investigates resource allocation policies fo...
Silicon technology will continue to provide an exponential increase in the availability of raw transistors. Effectively translating this resource into application performance, how...
Steven Swanson, Ken Michelson, Andrew Schwerin, Ma...