An asynchronous work-stealing approach to dynamic load balancing is implemented using Unified Parallel C (UPC) and evaluated using the Unbalanced Tree Search (UTS) benchmark ...
This paper presents a two-part study on managing distributed NUCA (Non-Uniform Cache Architecture) L2 caches in a future manycore processor to obtain high single-thread program per...
Parallel I/O plays an increasingly important role in today’s data-intensive computing applications. While much attention has been paid to parallel read performance, most of this...
We examine the problem of parallelizing the inferencing process for OWL knowledge-bases. A key challenge in this problem is partitioning the computational workload of t...
On systems with multi-core processors, the memory access scheduling scheme plays an important role not only in utilizing the limited memory bandwidth but also in balancing the pro...