The increasing gap between processor and memory performance has led to new architectural models for memory-intensive applications. In this paper, we use a set of memory-intensive ...
Brian R. Gaeke, Parry Husbands, Xiaoye S. Li, Leon...
Short vector SIMD instructions on recent microprocessors, such as SSE on Pentium III and 4, speed up code but are a major challenge to software developers. We present a compiler t...
Future high-end computers will offer great performance improvements over today’s machines, enabling applications of far greater complexity. However, designers must solve the cha...
Guang R. Gao, Kevin B. Theobald, Ziang Hu, Haiping...
Efficient load balancing algorithms are the key to many efficient parallel applications. Until now, research in this area has mainly been focusing on homogeneous schemes. Howeve...