The floating-point multiply-add fused (MAF) unit sets a new trend in processor design to speed up floating-point performance in scientific and multimedia applications. This ...
In the past, efficient parallel algorithms have always been developed specifically for the successive generations of parallel systems (vector machines, shared-memory machines, d...
We compare the performance of systems consisting of one large cluster containing q processors with systems where processors are grouped into k clusters containing u processors eac...
In this article we report on our efforts to test and expand the current state of the art in eigenvalue solvers applied to the field of nanotechnology. We singled out the nonlinea...
Stanimire Tomov, Julien Langou, Andrew Canning, Li...
Performance analysis of high-performance systems is a difficult task. Current tools have proven successful in analysis tasks but their implementation is limited in several respects...