In this paper, we present techniques and algorithms to improve the performance of various communication patterns on message-passing platforms where, for reasons of safety, user-le...
We present an architecture that features dynamic multithreading execution of a single program. Threads are created automatically by hardware at procedure and loop boundaries and e...
The CPU is one of the major power consumers in a portable computer, and considerable power can be saved by turning off the CPU when it is not doing useful work. In Apple's Ma...
This paper demonstrates how an Instruction Path Coprocessor (I-COP) can be efficiently implemented using the PipeRench reconfigurable architecture. An I-COP is a programmable on-c...
Yuan C. Chou, Pazhani Pillai, Herman Schmit, John ...
Interconnect delays are becoming an increasingly significant part of the critical path delay for circuits implemented in FPGAs. Pipelined interconnects have been proposed to addre...