We propose an organization for the on-chip memory system of a chip multiprocessor, in which 16 processors share a 16MB pool of 256 L2 cache banks. The L2 cache is organized as a n...
Jaehyuk Huh, Changkyu Kim, Hazim Shafi, Lixin Zhan...
Single-particle 3D reconstruction from cryo-electron microscopy (cryo-EM) images is a kernel application in the analysis of biological molecules, the computational requirement of whic...
The emergence of power as a first-class design constraint has fueled the proposal of a growing number of run-time power optimizations. Many of these optimizations trade off power...
This paper proposes CADRE (Collaborative Allocation and Deallocation of Replicas with Efficiency), a dynamic replication scheme for improving the typically low data availability ...
This paper proposes a high-performance least-squares solver on FPGAs using the Cholesky decomposition method. Our design can be realized by iteratively adopting a single triangular...
Depeng Yang, Gregory D. Peterson, Husheng Li, Junq...