Sciweavers

NPC
2005
Springer

Performance Modelling and Optimization of Memory Access on Cellular Computer Architecture Cyclops64

This paper focuses on the Cyclops64 computer architecture and presents an analytical model and performance simulation results for the preloading and loop unrolling approaches to optimizing the performance of the SVD (Singular Value Decomposition) benchmark. A performance model for dissecting the total execution cycles is presented. Data preloading, using either “memcpy” or hand-optimized “inline” assembly code, and loop unrolling are implemented and compared in terms of the total number of memory access cycles. The key idea is to preload data from off-chip to on-chip memory and store the data back after the computation. These approaches can reduce the total memory access cycles and can thus improve the benchmark performance significantly.
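The preload, compute, store-back pattern described in the abstract can be illustrated with a minimal C sketch. This is not the paper's SVD kernel or its Cyclops64 code: the buffer `onchip_buf`, its capacity `ONCHIP_WORDS`, and the function `scale_with_preload` are hypothetical names, an ordinary static array stands in for on-chip memory, and a simple scaling loop unrolled by four stands in for the real computation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical capacity of the on-chip buffer, in doubles (an assumption). */
#define ONCHIP_WORDS 256

/* Stand-in for on-chip memory. On real hardware the buffer would be placed
   in on-chip memory by a platform-specific mechanism not shown here. */
static double onchip_buf[ONCHIP_WORDS];

/* Scale n doubles that live in "off-chip" memory:
   preload them, compute on the local copy, then store the result back. */
void scale_with_preload(double *offchip, size_t n, double factor)
{
    size_t i = 0;

    assert(n <= ONCHIP_WORDS);  /* the block must fit in the buffer */

    /* 1. Preload: one bulk copy instead of many scattered off-chip reads. */
    memcpy(onchip_buf, offchip, n * sizeof(double));

    /* 2. Compute on the local copy, with the loop unrolled by four. */
    for (; i + 4 <= n; i += 4) {
        onchip_buf[i]     *= factor;
        onchip_buf[i + 1] *= factor;
        onchip_buf[i + 2] *= factor;
        onchip_buf[i + 3] *= factor;
    }
    /* Remainder loop for lengths that are not a multiple of four. */
    for (; i < n; i++) {
        onchip_buf[i] *= factor;
    }

    /* 3. Store back: one bulk copy of the results to off-chip memory. */
    memcpy(offchip, onchip_buf, n * sizeof(double));
}
```

Calling `scale_with_preload(v, 7, 2.0)` on a seven-element array doubles every element in place. The first four elements go through the unrolled loop and the last three through the remainder loop.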
Added 28 Jun 2010
Updated 28 Jun 2010
Type Conference
Year 2005
Where NPC
Authors Yanwei Niu, Ziang Hu, Kenneth E. Barner, Guang R. Gao