In our research project named “Mega-Scale Computing Based on Low-Power Technology and Workload Modeling”, we have been developing a prototype cluster not based on ASIC or FPGA...
Hiding communication latency is an important optimization for parallel programs. Programmers or compilers achieve this by using non-blocking communication primitives and overlappi...
Floating-point Sparse Matrix-Vector Multiplication (SpMXV) is a key computational kernel in scientific and engineering applications. The poor data locality of sparse matrices sig...
With the increasing performance gap between the processor and the memory, the importance of caches is increasing for high performance processors. However, with reducing feature si...
As FPGA densities increase, partitioning-based FPGA placement approaches are becoming increasingly important as they can be used to provide high-quality and computationally scalab...