
Tsinghua U.
14 years 4 months ago
Zero-content augmented caches
It has been observed that some applications manipulate large amounts of null data. Moreover, these zero data often exhibit high spatial locality. On some applications more than 20%...
Julien Dusser, Thomas Piquet, André Seznec
Tsinghua U.
14 years 4 months ago
Adagio: making DVS practical for complex HPC applications
Power and energy are first-order design constraints in high performance computing. Current research using dynamic voltage scaling (DVS) relies on trading increased execution time...
Barry Rountree, David K. Lowenthal, Bronis R. de S...
Tsinghua U.
14 years 4 months ago
Tuned and wildly asynchronous stencil kernels for hybrid CPU/GPU systems
We describe heterogeneous multi-CPU and multi-GPU implementations of Jacobi’s iterative method for the 2-D Poisson equation on a structured grid, in both single- and double-preci...
Sundaresan Venkatasubramanian, Richard W. Vuduc
Tsinghua U.
14 years 4 months ago
Towards 100 Gbit/s Ethernet: multicore-based parallel communication protocol design
Ethernet line rates are projected to reach 100 Gbits/s by as soon as 2010. While in principle suitable for high performance clustered and parallel applications, Ethernet requires ...
Stavros Passas, Kostas Magoutis, Angelos Bilas
Tsinghua U.
14 years 4 months ago
Computer generation of fast Fourier transforms for the Cell Broadband Engine
The Cell BE is a multicore processor with eight vector accelerators (called SPEs) that implement explicit cache management through direct memory access engines. While the Cell has...
Srinivas Chellappa, Franz Franchetti, Markus Pü...