Despite a burgeoning demand for parallel programs, the tools available to developers working on shared-memory multicore processors have lagged behind. One reason for this is the l...
Marek Olszewski, Qin Zhao, David Koh, Jason Ansel,...
Modern DRAMs have multiple banks to serve multiple memory requests in parallel. However, when two requests go to the same bank, they have to be served serially, exacerbating the h...
Yoongu Kim, Vivek Seshadri, Donghyuk Lee, Jamie Li...
In this paper we investigate various performance measures of interest, when comparing fast packetswitched networks that operate with either fixed or variable packet sizes. These p...
This paper describes a novel approach to performance analysis for parallel and distributed systems that is based on soft computing. We introduce the concept of performance score re...
Performance auditing is an online optimization strategy that empirically measures the effectiveness of an optimization on a particular code region. It has the potential to greatly...
Jeremy Lau, Matthew Arnold, Michael Hind, Brad Cal...