Consider a multithreaded parallel application running inside a multicore virtual machine context that is itself hosted on a multi-socket multicore physical machine. How should the...
With the widespread use of BLAST, many parallel versions of BLAST on cluster systems are announced, but little work has been done for the parallel execution in the search for indi...
In this paper, we present a methodology for low-cost and rapid context switch for multithreaded embedded processors with realtime guarantees. Context-switch, which involves saving...
We consider the filter decomposition problem in supporting coarse-grained pipelined parallelism. This form of parallelism is suitable for data-driven applications in scenarios wh...
In this paper we compare the performance of area equivalent small, medium, and large-scale multithreaded chip multiprocessors (CMTs) using throughput-oriented applications. We use...