Clustered microarchitectures are an effective approach to reducing the penalties caused by wire delays inside a chip. Current superscalar processors have in fact a two-cluster mic...
Ramon Canal, Joan-Manuel Parcerisa, Antonio Gonz&a...
Creating replicas of frequently accessed objects across a read-intensive network can result in large bandwidth savings which, in turn, can lead to reduction in user response time....
Current VLSI technology allows more than two wiring layers and the number is expected to rise in future. In this paper, we show that, by designing VLSI layouts directly for an L-l...
Chi-Hsiang Yeh, Emmanouel A. Varvarigos, Behrooz P...
This paper describes extensions to OpenMP that implement data placement features needed for NUMA architectures. OpenMP is a collection of compiler directives and library routines ...
John Bircsak, Peter Craig, RaeLyn Crowell, Zarka C...
Networks of Workstations (NOW) have become an attractive alternative platform for high performance computing. Due to the commodity nature of workstations and interconnects and due...
Mohammad Banikazemi, Vijay Moorthy, Dhabaleswar K....