Data prefetching has been widely used in the past as a technique for hiding memory access latencies. However, data prefetching in multi-threaded applications running on chip multi...
Dhruva Chakrabarti, Mahmut T. Kandemir, Mustafa Ka...
Partitioned Global Address Space (PGAS) languages combine the programming convenience of shared memory with the locality and performance control of message passing. One such langu...
Katherine A. Yelick, Dan Bonachea, Wei-Yu Chen, Ph...
The rapid revolution in microprocessor chip architecture due to multicore technology is presenting unprecedented challenges to the application developers as well as system softwar...
A multi-cabinet implementation of a combined input and crosspoint queued (CICQ) switch introduces a large RTT latency between the line cards and switch fabric, requiring a large cr...
Modern peer-to-peer file sharing systems distribute large files among peers using block partitioning. Blocks can be redistributed by a peer even before the whole file is available...
Christian Ortolf, Christian Schindelhauer, Arne Va...