- This paper presents a DSP core for multi-core media SoC, which is optimized to execute a set of signal processing tasks very efficiently. The fully-programmable core has a data-c...
Abstract. In a multicore transactional memory (TM) system, concurrent execution threads interact and interfere with each other through shared memory. The less interference a progra...
MPI Alltoall is one of the most communication intense collective operation used in many parallel applications. Recently, the supercomputing arena has witnessed phenomenal growth o...
Rahul Kumar, Amith R. Mamidala, Dhabaleswar K. Pan...
Abstract—This paper addresses the problem of execution time estimation for tasks in a software pipeline independent of the application structure or the underlying architecture. A...
With applications becoming larger and the increasing load on high performance systems, it is important to tackle the I/O bottleneck problem from several angles. It is not only ess...
Murali Vilayannur, Mahmut T. Kandemir, Anand Sivas...