Recurrent neural networks are theoretically capable of learning complex temporal sequences, but training them through gradient-descent is too slow and unstable for practical use i...
The MPI Standard supports derived datatypes, which allow users to describe noncontiguous memory layout and communicate noncontiguous data with a single communication function. Thi...
Surendra Byna, William D. Gropp, Xian-He Sun, Raje...
Memory dependence prediction allows out-of-order issue processors to achieve high degrees of instruction level parallelism by issuing load instructions at the earliest time withou...
This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a host processor and a simpler memory processor. To achieve high...
We propose a type system for a programming language with memory allocation/deallocation primitives, which prevents memory-related errors such as double-frees and memory leaks. The ...