As supercomputers are being built from an ever increasing number of processing elements, the effort required to achieve a substantial fraction of the system peak performance is con...
Simulating chip-multiprocessor systems (CMP) can take a long time. For single-threaded workloads, earlier work has shown the utility of phase analysis, that is identification of r...
Jeffrey Namkung, Dohyung Kim, Rajesh K. Gupta, Igo...
Dynamic binary instrumentation systems are used to inject or modify arbitrary instructions in existing binary applications; several such systems have been developed over the past ...
In processors with several levels of hardware resource sharing, like CMPs in which each core is an SMT, the scheduling process becomes more complex than in processors with a singl...
Petar Radojkovic, Vladimir Cakarevic, Javier Verd&...
The effect of the operating system on application performance is an increasingly important consideration in high performance computing. OS kernel measurement is key to understandi...
Aroon Nataraj, Allen D. Malony, Sameer Shende, Ala...