This paper introduces a programming interface called PARRAY (or Parallelizing ARRAYs) that supports system-level succinct programming for heterogeneous parallel systems like GPU c...
Transactional Memory (TM) simplifies parallel programming by supporting atomic and isolated execution of user-identified tasks. To date, TM programming has required the use of l...
Woongki Baek, Chi Cao Minh, Martin Trautmann, Chri...
We have developed an environment, based upon robust, existing, open source software, for tuning applications written using MPI, OpenMP or both. The goal of this effort, which inte...
Clusters of high-end workstations and PCs are currently used in many application domains to perform large-scale computations or as scalable servers for I/O bound tasks. Although c...
Understanding why the performance of a multithreaded program does not improve linearly with the number of cores in a sharedmemory node populated with one or more multicore process...