Since there is generally insufficient instruction level parallelism within a single basic block, higher performance is achieved by speculatively scheduling operations in superbloc...
In order to achieve practical efficient execution on a parallel architecture, a knowledge of the data dependencies related to the application appears as the key point for building...
We present the Data Parallel Fortran (DPF) benchmark suite, a set of data parallel Fortran codes forevaluatingdata parallel compilers appropriatefor any target parallel architectu...
Y. Charlie Hu, S. Lennart Johnsson, Dimitris Kehag...
This paper presents, QuickStore, a memory-mapped storage system for persistent C++ built on top of the EXODUS Storage Manager. QuickStore provides fast access to in-memory objects...
Autotuning technology has emerged recently as a systematic process for evaluating alternative implementations of a computation, in order to select the best-performing solution for...
Jaewook Shin, Mary W. Hall, Jacqueline Chame, Chun...