In this article, we present a parallel implementation of a 1024 point Fast Fourier Transform (FFT) operating with a subthreshold supply voltage, which is below the voltage that tur...
The Cell BE is a multicore processor with eight vector accelerators (called SPEs) that implement explicit cache management through direct memory access engines. While the Cell has...
Srinivas Chellappa, Franz Franchetti, Markus P&uum...
Abstract. Designing and tuning parallel applications with MPI, particularly at large scale, requires understanding the performance implications of different choices of algorithms ...
Torsten Hoefler, William Gropp, Rajeev Thakur, Jes...
—The Spectral Hash algorithm is one of the Round 1 candidates for the SHA-3 family, and is based on spectral arithmetic over a finite field, involving multidimensional discrete...
Two problems from the recently published “NAS Parallel Benchmarks” have been implemented on three advanced parallel computer systems. These two benchmarks are the following: (...