Data locality is critical to achievinghigh performance on large-scale parallel machines. Non-local data accesses result in communication that can greatly impact performance. Thus ...
Previous studies have shown that wireless DSP algorithms exhibit high levels of data level parallelism (DLP). Commercial and research work in the field of software defined radio...
Mark Woh, Yuan Lin, Sangwon Seo, Trevor N. Mudge, ...
Modern embedded systems for image processing involve increasingly complex levels of functionality under real-time and resourcerelated constraints. As this complexity increases, th...
The Cell Broadband Engine (Cell BE) is a heterogeneous multi-core processor specifically designed to exploit thread-level parallelism. Its memory model comprehends a common shared ...
In this paper we propose a design methodology for low-power, high-performance, process-variation tolerant architecture for arithmetic units. The novelty of our approach lies in th...