Efficient performance tuning of parallel programs is often hard. In this paper we describe an approach that uses a uniprocessor execution of a multithreaded program as reference ...
Massively parallel SIMD array architectures are making their way into embedded processors. In these architectures, a number of identical processing elements having small private st...
Anton Lokhmotov, Benedict R. Gaster, Alan Mycroft,...
In this paper we report on features added to a parallel debugger to simplify the debugging of message passing programs. These features include replay, setting consistent breakpoin...
The ability to quickly predict the throughput of a TCP transfer between a client and a server, or between peers, has wide application in scientific computing and commercial compu...
The aim of this paper is to promote the idea of developing reusable coordination patterns for parallel computing, i.e., customizable components from which parallel applications can ...