Compiler optimizations are often driven by specific assumptions about the underlying architecture and implementation of the target machine. For example, when targeting shared-mem...
Jack L. Lo, Susan J. Eggers, Henry M. Levy, Sujay ...
Pipelining has been used in the design of many PRAM algorithms to reduce their asymptotic running time. Paul, Vishkin, and Wagener (PVW) used the approach in a parallel implementat...
Abstract--Developing and tuning computational science applications to run on extreme scale systems are increasingly complicated processes. Challenges such as managing memory access...
Philip H. Carns, Robert Latham, Robert B. Ross, Ka...
Profilers play an important role in software/hardware design, optimization, and verification. Various approaches have been proposed to implement profilers. The most widespread app...
Lei Gao, Jia Huang, Jianjiang Ceng, Rainer Leupers...
Abstract. In the field of HPC, the current hardware trend is to design multiprocessor architectures that feature heterogeneous technologies such as specialized coprocessors (e.g., ...