Detecting races is important for debugging shared-memory parallel programs, because races result in unintended nondeterministic executions of the programs. Previous on-the-fly t...
This paper describes the design and verification of a high-performance asynchronous differential equation solver benchmark circuit. The design has low control overhead whi...
Kenneth Y. Yun, Ayoob E. Dooply, Julio Arceo, Pete...
New file systems are critical to obtaining good I/O performance on large multiprocessors. Several researchers have suggested the use of collective file-system operations, in which ...
We present techniques for exploiting parallelism extracted from loops on an MIMD system. Parallelism is exploited through parallel execution of instructions on multiple processor...
Coarse Grained Reconfigurable Array (CGRA) architectures give high throughput and data reuse for regular algorithms while providing flexibility to execute multiple algorithms on th...