Computer architects utilize simulation tools to evaluate the merits of a new design feature. The time needed to adequately evaluate the tradeoffs associated with adding any new fe...
Kaushal Sanghai, Ting Su, Jennifer G. Dy, David R....
We describe a programming framework for high performance clusters with various hardware accelerators. In this framework, users can utilize the available heterogeneous resources pr...
Kuen Hung Tsoi, Anson H. T. Tse, Peter Pietzuch, W...
— Modern CPUs operate at GHz frequencies, but the latencies of memory accesses are still relatively large, in the order of hundreds of cycles. Deeper cache hierarchies with large...
Konrad Malkowski, Greg M. Link, Padma Raghavan, Ma...
Although SIMD (Single Instruction stream Multiple Data stream) parallel computers have existed for decades, it is only in the past few years that a new version of SIMD has evolved...
This paper presents a multiprocessor system on FPGA that adopts Direct Memory Access (DMA) mechanisms to move data between the external memory and the local memory of each process...