Clusters have made the jump from lab prototypes to fullfledged production computing platforms. The number, variety, and specialized configurations of these machines are increasi...
VLIW architectures are popular in embedded systems because they offer high-performance processing at low cost and energy. The major problem with traditional VLIW designs is that t...
Hongtao Zhong, Kevin Fan, Scott A. Mahlke, Michael...
The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technolo...
While microprocessor designers turn to multicore architectures to sustain performance expectations, the dramatic increase in parallelism of such architectures will put substantial...
Susmit Biswas, Diana Franklin, Alan Savage, Ryan D...
Communication in cache-coherent distributed shared memory (DSM) often requires invalidating (or writing back) cached copies of a memory block, incurring high overheads. This paper...