— We consider memory subsystem optimizations for improving the performance of sparse scientific computation while reducing the power consumed by the CPU and memory. We first co...
In this paper we study the performance improvements and trade-offs derived from an optimized mapping approach applied on a parametric coarse grained reconfigurable array architect...
Grigoris Dimitroulakos, Michalis D. Galanis, Const...
High performance microprocessors are designed with generalpurpose applications in mind. When it comes to embedded applications, these architectures typically perform controlintens...
1 This paper presents a simple but effective method to reduce on-chip access latency and improve core isolation in CMP Non-Uniform Cache Architectures (NUCA). The paper introduces ...
Javier Merino, Valentin Puente, Pablo Prieto, Jos&...
Partitioned Global Address Space (PGAS) languages provide a unique programming model that can span shared-memory multiprocessor (SMP) architectures, distributed memory machines, o...