A traditional fixed-function graphics accelerator has evolved into a programmable general-purpose graphics processing unit over the last few years. These powerful computing cores...
Despite the large research efforts in the SW–DSM community, this technology has not yet been widely adopted for significant codes beyond benchmark suites. One of the reasons co...
As microarchitectural and system complexity grows, comprehending system behavior becomes increasingly difficult, and often requires obtaining and sifting through voluminous event ...
Martin Schulz, Brian S. White, Sally A. McKee, Hsi...
This paper proposes the use of microprocessor performance counters for online measurement of complete system power consumption. While past studies have demonstrated the use of per...
With the advent of ubiquitous multi-core architectures, a major challenge is to simplify parallel programming. One way to tame one of the main sources of programming complexity, n...
Luis Ceze, Pablo Montesinos, Christoph von Praun, ...