DRAM vendors have traditionally optimized the cost-perbit metric, often making design decisions that incur energy penalties. A prime example is the overfetch feature in DRAM, wher...
Aniruddha N. Udipi, Naveen Muralimanohar, Niladris...
As the effective limits of frequency and instruction level parallelism have been reached, the strategy of microprocessor vendors has changed to increase the number of processing ...
Geoffrey Blake, Ronald G. Dreslinski, Trevor N. Mu...
Abstract. We present new performance models and a new, more compact data structure for cache blocking when applied to the sparse matrixvector multiply (SpM×V) operation, y ← y +...
Rajesh Nishtala, Richard W. Vuduc, James Demmel, K...
In light of advances in processor and networking technology, especially the emergenceof networkattached disks,the traditional clientserver architecture of file systems has become...
Jiong Yang, Wei Wang 0010, Richard R. Muntz, Silvi...
Multimedia processing on embedded devices requires an architecture that leads to high performance, low power consumption, reduced design complexity, and small code size. In this p...