Abstract. Performance analysis for terascale computing requires a combination of new concepts including distribution, on-line processing and automation. As a foundation for tools r...
Over the last decade, Message Passing Interface (MPI) has become a very successful parallel programming environment for distributed memory architectures such as clusters. However, ...
This paper studies the impact of L2 cache sharing on threads that simultaneously share the cache, on a Chip Multi-Processor (CMP) architecture. Cache sharing impacts threads non-u...
Dhruba Chandra, Fei Guo, Seongbeom Kim, Yan Solihi...
Given the importance of parallel mesh generation in large-scale scientific applications and the proliferation of multilevel SMTbased architectures, it is imperative to obtain ins...
Christos D. Antonopoulos, Xiaoning Ding, Andrey N....
Synchronous Dataflow (SDF) is a powerful analysis tool for regular, cyclic, parallel task graphs. The behaviour of SDF graphs however is static and therefore not always able to ac...