In the era of multicores, many applications that tend to require substantial compute power and data crunching (aka Throughput Computing Applications) can now be run on desktop PCs...
Chi-Keung Luk, Ryan Newton, William Hasenplaugh, M...
Software pipelining is a loop optimization that overlaps the execution of several iterations of a loop to expose more instruction-level parallelism. It can result in first-class p...
Compilers use register coalescing to avoid generating code for copy instructions. For architectures with register aliasing such as x86, Smith, Ramsey, and Holloway (2004) presented...
—The implementation and optimization of collective communication operations is an important field of active research. Such operations directly influence application performance...
Torsten Hoefler, Christian Siebert, Andrew Lumsdai...
Packet filters play an essential role in traffic management and security management on the Internet. In order to create software-based packet filters that are fast enough to work...