Media applications are characterized by large amounts of available parallelism, little data reuse, and a high computation to memory access ratio. While these characteristics are p...
Scott Rixner, William J. Dally, Ujval J. Kapasi, B...
Abstract— The migration away from power-hungry, speculative execution procesors towards manycore architectures is good news for the embedded and real-time systems community. Comm...
This paper describes a novel approach to generate an optimized schedule to run threads on distributed shared memory (DSM) systems. The approach relies upon a binary instrumentatio...
Narrow values that can be represented by less number of bits than the full machine width occur very frequently in programs. On the other hand, clustering mechanisms enable cost- a...
Osman S. Unsal, Oguz Ergin, Xavier Vera, Antonio G...
Abstract. IBM is currently developing the new line of BlueGene/L supercomputers. The top-of-the-line installation is planned to be a 65,536 processors system featuring a peak perfo...
Franz Franchetti, Stefan Kral, Juergen Lorenz, Mar...