This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a host processor and a simpler memory processor. To achieve high...
An algorithm and implementation is presented to compute the exact arrangement induced by arbitrary algebraic surfaces on a parametrized ring Dupin cyclide. The family of Dupin cyc...
Hardware signatures have been recently proposed as an efficient mechanism to detect conflicts amongst concurrently running transactions in transactional memory systems (e.g., Bulk...
While parallelism and multi-cores are receiving much attention as a major scalability path, customization is another, orthogonal and complementary, scalability path which can targ...
Sami Yehia, Sylvain Girbal, Hugues Berry, Olivier ...
As the number of cores and threads in manycore compute accelerators such as Graphics Processing Units (GPU) increases, so does the importance of on-chip interconnection network des...