Optimal network performance is critical to efficient parallel scaling for communication-bound applications on large machines. With wormhole routing, no-load latencies do not increa...
Abhinav Bhatele, Eric J. Bohm, Laxmikant V. Kalé...
This paper presents a new sequential decoding algorithm based on a dynamic searching strategy to improve decoding efficiency. The searching strategy is to exploit both sorting and p...
This paper explores timing anomalies in WCET analysis. Timing anomalies add to the complexity of WCET analysis and make it hard to apply divide-and-conquer strategies to simplify ...
Raimund Kirner, Albrecht Kadlec, Peter P. Puschner
Recent work has shown that multithreaded workloads running in execution-driven, full-system simulation environments cannot use instructions per cycle (IPC) as a valid performance ...
One purpose of the end-user tools described in this paper is to give users a graphical representation of performance information that has been gathered by instrumenting an applica...
Kevin S. London, Jack Dongarra, Shirley Moore, Phi...