Control intensive scalar programs pose a very different challenge to highly pipelined supercomputers than vectorizable numeric applications. Function call/return and branch instru...
As the industry moves toward larger-scale chip multiprocessors, the need to parallelize applications grows. High inter-thread communication delays, exacerbated by over-stressed hi...
Ram Rangan, Neil Vachharajani, Adam Stoler, Guilhe...
Very Long Instruction Word (VLIW) architectures exploit instruction level parallelism (ILP) with the help of the compiler to achieve higher instruction throughput with minimal hard...
Memory trace analysis is an important technology for architecture research, system software (i.e., OS, compiler) optimization, and application performance improvements. Many appro...
Yungang Bao, Mingyu Chen, Yuan Ruan, Li Liu, Jianp...
Many architectural ideas that appear to be useful from a hardware standpoint fail to achieve wide acceptance due to lack of compiler support. In this paper we explore the design of...
David Judd, Katherine A. Yelick, Christoforos E. K...