In this paper we describe the design, implementation and experimental evaluation of a technique for operating system schedulers called processor pool-based scheduling [51]. Our tec...
The events occurring in the execution of a distributed or parallel application are related by a partial, rather than a total, order. We have developed prototype software that coll...
Resource sharing can cause unfair and unpredictable performance of concurrently executing applications in Chip-Multiprocessors (CMP). The shared last-level cache is one of the mos...
With the ever-increasing transistor variability in CMOS technology, it is essential to integrate variation-aware performance analysis into the task allocation and scheduling proce...
Sparse linear solvers account for much of the execution time in many high-performance computing (HPC) applications, and not every solver works on all problems. Hence choosing a su...