Recent research on processor microarchitecture suggests using instruction criticality as a metric to guide hardware control policies. Fields et al. [3, 4] have proposed a directed...
Abstract. We reexamine the limits of parallelism available in programs, using runtime reconstruction of program data-flow graphs. While limits of parallelism have been examined in...
An asynchronous work-stealing implementation of dynamic load balance is implemented using Unified Parallel C (UPC) and evaluated using the Unbalanced Tree Search (UTS) benchmark ...
This paper introduces the queue-read, queue-write (qrqw) parallel random access machine (pram) model, which permits concurrent reading and writing to shared memory locations, but ...
Phillip B. Gibbons, Yossi Matias, Vijaya Ramachand...
Multithreaded parallel system with software Distributed Shared Memory (DSM) is an attractive direction in cluster computing. In these systems, distributing workloads and keeping t...