We believe that future many-core architectures should support a simple and scalable way to execute many threads that are generated by parallel programs. A good candidate to impleme...
Abstract— In this paper we address the problem of the architectural exploration from the energy/performance point of view of a VLIW processor for embedded systems. We also consid...
Motivated by the need for high throughput sphere decoding for multipleinput-multiple-output (MIMO) communication systems, we propose a parallel depth-first sphere decoding (PDSD)...
This paper presents a new approach for analyzing the performance of grid scheduling algorithms for tasks with dependencies. Finding the optimal procedures for DAG scheduling in Gr...
We present a customizable simulator called netsim for high-performance point-to-point workstation networks that is accurate enough to be used for application-level performance ana...
Mustafa Uysal, Anurag Acharya, Robert Bennett, Joe...