This paper discusses the design and the implementation of the LU factorization routines included in the Heterogeneous ScaLAPACK library, which is built on top of ScaLAPACK. These ...
Ravi Reddy Manumachu, Alexey L. Lastovetsky, Pedro...
Serial arithmetic uses less hardware than parallel arithmetic. Serial floating point (FP) is slower than parallel FP. The Logarithmic Number System (LNS) simplifies operations, ...
Soft-core processors exploit the flexibility of Field Programmable Gate Arrays (FPGAs) to allow a system designer to customize the processor to the needs of a target application....
Franjo Plavec, Blair Fort, Zvonko G. Vranesic, Ste...
The behavior and performance of MPI non-blocking message passing operations are sensitive to implementation specifics, as they are heavily dependent on available system-level buff...
The Manticore project is an effort to design and implement a new functional language for parallel programming. Unlike many earlier parallel languages, Manticore is a heterogeneous...
Matthew Fluet, Nic Ford, Mike Rainey, John H. Repp...