Thread-Level Speculation (TLS) allows us to automatically parallelize general-purpose programs by supporting parallel execution of threads that might not actually be independent. ...
J. Gregory Steffan, Christopher B. Colohan, Antoni...
Ethernet line rates are projected to reach 100 Gbits/s as soon as 2010. While in principle suitable for high performance clustered and parallel applications, Ethernet requires ...
A key step in program optimization is the determination of optimal values for code optimization parameters such as cache tile sizes and loop unrolling factors. One approach, which...
Component architectures provide a useful framework for developing an extensible and maintainable code base upon which large-scale software projects can be built. Component...
Brian Barrett, Jeffrey M. Squyres, Andrew Lumsdain...
While previous CPU- or memory-centric load balancing schemes are capable of achieving the effective usage of global CPU and memory resources in a cluster system, the cluster exhib...
Xiao Qin, Hong Jiang, Yifeng Zhu, David R. Swanson