Parallel programs that modify shared data in a cache-coherent multiprocessor with a write-invalidate coherence protocol create ownership overhead in the form of ownership acquisiti...
Time skewing is a compile-time optimization that can provide arbitrarily high cache hit rates for a class of iterative calculations, given a sufficient number of time steps and s...
Standard JPEG decompression reconstructs quantized DCT coefficients to the center of the quantization bin. This fails to exploit the nonuniform distribution of the AC coefficien...
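The center-of-bin reconstruction described above can be illustrated with a minimal sketch. It assumes uniform scalar quantization with round-to-nearest; the function names, the sample coefficients, and the step sizes are hypothetical and chosen only for illustration. It shows the standard behavior only, not the alternative the abstract goes on to propose.

```python
# Minimal sketch of standard (center-of-bin) dequantization.
# All names and sample values here are illustrative assumptions.

def quantize(coeffs, steps):
    """Map each DCT coefficient to the index of its quantization bin."""
    return [round(c / q) for c, q in zip(coeffs, steps)]

def dequantize_center(indices, steps):
    """Reconstruct each coefficient at the center of its bin.

    Bin k covers roughly [(k - 0.5) * q, (k + 0.5) * q], so its center is k * q.
    """
    return [k * q for k, q in zip(indices, steps)]

coeffs = [-26.3, 3.9, 11.2, 0.4]   # hypothetical DCT coefficients
steps = [16, 11, 10, 16]           # hypothetical quantization step sizes

indices = quantize(coeffs, steps)
recon = dequantize_center(indices, steps)
```

Every coefficient that falls in the same bin is reconstructed to the same value, regardless of how the original coefficients were distributed inside that bin.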
The increasing number of cores, shared caches, and memory nodes within machines introduces a complex hardware topology. High-performance computing applications now have to carefull...
A major challenge in Metacomputing Systems (Computational Grids) is to effectively use their shared resources, such as compute cycles, memory, communication network, and data repo...
Ammar H. Alhusaini, Viktor K. Prasanna, Cauligi S....