GPGPUs have recently emerged as powerful vehicles for general-purpose high-performance computing. Although a new Compute Unified Device Architecture (CUDA) programming model from N...
A program analysis tool can play an important role in helping users understand and improve OpenMP codes. Dragon is a robust interactive program analysis tool based on the Open64 co...
The Cell processor is a heterogeneous multi-core processor with one Power Processing Engine (PPE) core and eight Synergistic Processing Engine (SPE) cores. Each SPE has a directly...
Kevin O'Brien, Kathryn M. O'Brien, Zehra Sura, Ton...
Abstract. This paper describes performance of OSCAR multigrain parallelizing compiler on various SMP servers, such as IBM pSeries 690, Sun Fire V880, Sun Ultra 80, NEC TX7/i6010 an...
Kazuhisa Ishizaka, Takamichi Miyamoto, Jun Shirako...
Data locality is critical to achieving high performance on large-scale parallel machines. Non-local data accesses result in communication that can greatly impact performance. Thus ...