Sparse LU factorization with partial pivoting is important for many scienti c applications and delivering high performance for this problem is di cult on distributed memory machin...
Today, it is possible to associate multiple CPUs and multiple GPUs in a single shared memory architecture. Using these resources efficiently in a seamless way is a challenging issu...
—We present the design and implementation of a parallel exact inference algorithm on the Cell Broadband Engine (Cell BE). Exact inference is a key problem in exploring probabilis...
Load distribution is an essential factor to parallel efficiency of numerical simulations that are based on spatial grids, especially on clusters of symmetric multiprocessors (SMPs...
Huaien Gao, Andreas Schmidt, Amitava Gupta, Peter ...
Abstract This paper presents an embedded system design toolchain for automatic generation of parallel code runnable on symmetric multiprocessor systems from an initial sequential s...
Fabrizio Ferrandi, Luca Fossati, Marco Lattuada, G...