Modern computers have taken advantage of the instruction-level parallelism (ILP) available in programs with advances in both architecture and compiler design. Unfortunately, large...
Fine-grained multithreading based on a natural model, such as the dataflow model, is promising for achieving high efficiency and high programming productivity. In this paper, we disc...
Tiling has proven to be an effective mechanism for developing high-performance implementations of algorithms. Tiling can be used to organize computations so that communication costs i...
Ganesh Bikshandi, Jia Guo, Daniel Hoeflinger, Gheo...
Determinant Quantum Monte Carlo (DQMC) simulation has been widely used to reveal macroscopic properties of strongly correlated materials. However, parallelization of the DQ...
We propose a cluster-based web server where a few computing nodes are separately reserved for high-performance computing applications, such as multimedia, SSL, and CGI. As an exam...