This paper presents COBRA (Continuous Binary ReAdaptation), a runtime binary optimization framework for multithreaded applications. It is currently implemented on Itanium 2 based...
OpenMP has emerged as a widely accepted standard for writing shared-memory programs. Hardware-specific extensions such as data placement are usually needed to improve the scalabi...
In this paper, two tools are presented: an execution-driven cache simulator that relates event metrics to a dynamically built call graph, and a graphical front end able to visu...
Josef Weidendorfer, Markus Kowarschik, Carsten Tri...
We developed an automated environment to measure the memory access behavior of applications on high-performance clusters. Code optimization for processor caches is cruc...
Software instrumentation is a powerful and flexible technique for analyzing the dynamic behavior of programs. By inserting extra code in an application, it is possible to study...
Alex Skaletsky, Tevi Devor, Nadav Chachmon, Robert...