Compilation has a long history of translating a programmer's human-readable code into machine instructions designed to make good use of a specific target computer. In this pa...
—Since many embedded systems execute a predefined set of programs, tuning system components to application programs and data is the approach chosen by many design techniques to o...
Tw o parallel programming models represented b y OpenMP and MPI are compared for PDE solvers based on regular sparse numerical operators. As a typical representative of such an app...
Current hardware transactional memory systems seek to simplify parallel programming, but assume that large transactions are rare, so it is acceptable to penalize their performance...
Jayaram Bobba, Neelam Goyal, Mark D. Hill, Michael...
Graphics processors (GPUs) provide a vast number of simple, data-parallel, deeply multithreaded cores and high memory bandwidths. GPU architectures are becoming increasingly progr...
Shuai Che, Michael Boyer, Jiayuan Meng, David Tarj...