The algorithms in the current sequential numerical linear algebra libraries (e.g. LAPACK) do not parallelize well on multicore architectures. A new family of algorithms, the tile a...
Emmanuel Agullo, Henricus Bouwmeester, Jack Dongar...
Abstract--The now commonplace multi-core chips have introduced, by design, a deep hierarchy of memory and cache banks within parallel computers as a tradeoff between the user frien...
Recently, there has been strong interest in large-scale simulations of biological spiking neural networks (SNN) to model the human brain mechanisms and capture its inference capabi...
Mohammad A. Bhuiyan, Vivek K. Pallipuram, Melissa ...
Abstract. This paper is motivated by the desire to provide an efficient and scalable software cache implementation of OpenMP on multicore and manycore architectures in general, and...
Chen Chen, Joseph B. Manzano, Ge Gan, Guang R. Gao...
The IBM Cell Broadband Engine (BE) is a multicore processor with a PowerPC host processor (PPE) and 8 synergic processor engines (SPEs). The Cell BE architecture is designed to im...
Tamer F. Rabie, Hashir Karim Kidwai, Fadi N. Sibai