Abstract—Supporting Distributed Shared Memory (DSM) is essential for multi-core Network-on-Chips for the sake of reusing huge amount of legacy code and easy programmability. We p...
Xiaowen Chen, Zhonghai Lu, Axel Jantsch, Shuming C...
We present a new deterministic sorting algorithm that interleaves the partitioning of a sample sort with merging. Sequentially, it sorts n elements in O(n log n) time cache-oblivi...
While memory-safe and type-safe languages have been available for many years, the vast majority of software is still implemented in type-unsafe languages such as C/C++. Despite ma...
Babak Salamat, Andreas Gal, Todd Jackson, Karthike...
The algorithms in the current sequential numerical linear algebra libraries (e.g. LAPACK) do not parallelize well on multicore architectures. A new family of algorithms, the tile a...
Emmanuel Agullo, Henricus Bouwmeester, Jack Dongar...
As the effective limits of frequency and instruction level parallelism have been reached, the strategy of microprocessor vendors has changed to increase the number of processing ...
Geoffrey Blake, Ronald G. Dreslinski, Trevor N. Mu...