This paper describes the parallelization of a commercial molecular dynamics simulation code, GROMOS96, on a SCI (Scalable Coherent Interface) interconnected PC cluster. The underly...
Emergence of chip multiprocessor systems has dramatically increased the performance potential of computer systems. Since the amount of exploited parallelism is directly influenced ...
In order to produce MPI applications that perform well on today’s parallel architectures, programmers need effective tools for collecting and analyzing performance data. Because ...
Shirley Moore, David Cronk, Kevin S. London, Jack ...
We present a high performance algorithm for multiplying sparse distributed polynomials using a multicore processor. Each core uses a heap of pointers to multiply parts of the poly...
: A powerful and widely-used method for analyzing the performance behavior of parallel programs is event tracing. When an application is traced, performancerelevant events, such as...
Felix Wolf, Felix Freitag, Bernd Mohr, Shirley Moo...