Understanding why the performance of a multithreaded program does not improve linearly with the number of cores in a sharedmemory node populated with one or more multicore process...
This paper presents the design and implementation of a compiler algorithm that effectively optimizes programs for energy usage using dynamic voltage scaling (DVS). The algorithm ...
Divide and conquer algorithms are a good match for modern parallel machines: they tend to have large amounts of inherent parallelism and they work well with caches and deep memory...
Array inlining expands the concepts of object inlining to arrays. Groups of objects and arrays that reference each other are placed consecutively in memory so that their relative ...
Java exception checks are designed to ensure that any faulting instruction causing a hardware exception does not terminate the program abnormally. These checks, however, impose so...