This paper describes a proposal for a set of Parallel Basic Linear Algebra Subprograms PBLAS. The PBLAS are targeted at distributed vector-vector, matrix-vector and matrixmatrix...
Jaeyoung Choi, Jack Dongarra, Susan Ostrouchov, An...
Standard concurrency control mechanisms offer a trade-off: Transactional memory approaches maximize concurrency, but suffer high overheads and cost for retrying in the case of act...
Yannis Smaragdakis, Anthony Kay, Reimer Behrends, ...
In recent years, there has been a cross-fertilization of ideas between computational neuroscience models of the operation of the neocortex and artificial intelligence models of mac...
John Thornton, Jolon Faichney, Michael Blumenstein...
This paper explores the design space of MMU caches that accelerate virtual-to-physical address translation in processor architectures, such as x86-64, that use a radix tree page t...
With the current developments in CPU implementations, it becomes obvious that ever more parallel multicore systems will be used even in embedded controllers that require real-time...