We investigate the computational complexity of the task of detecting dense regions of an unknown distribution from unlabeled samples of this distribution. We introduce a formal l...
The increasing number of cores, shared caches and memory nodes within machines introduces a complex hardware topology. High-performance computing applications now have to carefull...
This paper presents an investigation into local mechanisms and scheduling policies that allow guest processes to efficiently exploit otherwise-idle workstation resources. Unlike t...
Kyung Dong Ryu, Jeffrey K. Hollingsworth, Peter J....
Dynamic program optimization is the only recourse for optimizing compilers when machine and program parameters necessary for applying an optimization technique are unknow...
The goal of this paper is to gain insight into the relative performance of communication mechanisms as bisection bandwidth and network latency vary. We compare shared memory with ...
Frederic T. Chong, Rajeev Barua, Fredrik Dahlgren,...