
WMPI
2004
ACM

A low cost, multithreaded processing-in-memory system

This paper discusses die cost vs. performance tradeoffs for a processing-in-memory (PIM) system that could serve as the memory system of a host processor. For an increase of less than twice the cost of a commodity DRAM part, it is possible to realize a performance speedup of nearly a factor of 4 on irregular applications. This cost efficiency derives from developing a custom multithreaded processor architecture and implementation style that is well suited for embedding in a memory. Specifically, it takes advantage of the low latency and high row bandwidth both to simplify processor design, reducing area, and to improve processing throughput. To support our claims of cost and performance, we have used simulation and analysis of existing chips, and have also designed and fully implemented a prototype chip, PIM Lite.
Added 30 Jun 2010
Updated 30 Jun 2010
Type Conference
Year 2004
Where WMPI
Authors Jay B. Brockman, Shyamkumar Thoziyoor, Shannon K. Kuntz, Peter M. Kogge