There has recently been increasing interest in using system virtualization to improve the dependability of HPC cluster systems. However, virtualization is not cost-free and may come with som...
Haibo Chen, Rong Chen, Fengzhe Zhang, Binyu Zang, ...
This paper describes a scalable parallel marking technique for garbage collection that does not employ any synchronization operation. To achieve good scalability, two major design...
Because direct numerical simulation (DNS) of channel turbulence places heavy demands on memory and speed, DNS could previously be performed only at moderate Reynolds numbers. W...
The IBM Cyclops-64 (C64) chip employs a multithreaded architecture that integrates a large number of hardware thread units on a single chip. A cellular supercomputer is being deve...
Well-designed domain-specific languages enable the easy expression of problems, the application of domain-specific optimizations, and dramatic improvements in productivity for t...
Jun Cao, Ayush Goyal, Samuel P. Midkiff, James M. ...