Memory may be the only system component that is more commoditized than a microprocessor. To simultaneously exploit this and address the impending memory wall, processing in memory...
Arun Rodrigues, Richard C. Murphy, Peter M. Kogge,...
This paper presents our experience mapping OpenMP parallel programming model to the IBM Cyclops-64 (C64) architecture. The C64 employs a many-core-on-a-chip design that integrates...
We consider scheduling problems in which a job consists of components of different types to be processed on m machines. Each machine is capable of processing components of a singl...
As software becomes increasingly complex and difficult to analyze, it is more and more common for developers to use high-level, type-safe, object-oriented (OO) programming langua...
This paper describes a compiler for stream programs that efficiently schedules computational kernels and stream memory operations, and allocates on-chip storage. Our compiler uses...