Efficient execution of well-parallelized applications is central to performance in the multicore era. Program analysis tools support the hardware and software sides of this effor...
Collective communication is very useful for parallel applications, especially those in which matrix and vector data structures need to be manipulated by a group of processes. This...
Rafael Ennes Silva, Delcino Picinin, Marcos E. Bar...
The LAPACK software project, currently under development, is intended to provide a portable linear algebra library for high-performance computers. LAPACK will make use of the Level 1...
Computer systems increasingly rely on dynamic, phase-based system management techniques, in which system hardware and software parameters may be altered or tuned at runtime for dif...
Cray X1 Fortran and C/C++ compilers provide a number of loop transformations, notably vectorization and multistreaming, in order to exploit the multistreaming processor (MSP) hard...