The performance of irregular applications on modern computer systems is hurt by the wide gap between CPU and memory speeds because these applications typically underutilize multi-...
John M. Mellor-Crummey, David B. Whalley, Ken Kenn...
Growing demand for high performance in embedded systems is creating new opportunities for Instruction-Level Parallelism ILP techniques that are traditionally used in high perform...
Daniel A. Connors, Jean-Michel Puiatti, David I. A...
With the emergence of global market and virtual enterprises, coordination of business processes in dispersed organizations by distributed workflow execution will take on more impo...
With the advancements in realtime ray tracing and new global illumination algorithms we are now able to render the most important illumination effects at interactive rates. One of...
DTA (Decoupled Threaded Architecture) is designed to exploit fine/medium grained Thread Level Parallelism (TLP) by using a distributed hardware scheduling unit and relying on exi...