The Sort operation is a core part of many critical applications. Despite the large efforts to parallelize it, the fact that it suffers from high data-dependencies vastly limits it...
Layali K. Rashid, Wessam Hassanein, Moustafa A. Ha...
Optimizing the performance of a multi-core microprocessor within a power budget has recently received a lot of attention. However, most existing solutions are centralized and cann...
Unlike familiar macroblock-based in-loop deblocking filter in H.264, the filters of VC-1 perform all horizontal edges (for in-loop deblocking filtering) or vertical edges (for ove...
Sparse Matrix-Vector multiplication (SpMV) is a very challenging computational kernel, since its performance depends greatly on both the input matrix and the underlying architectur...
Vasileios Karakasis, Georgios I. Goumas, Nectarios...
This paper identifies node affinity as an important property for scalable general-purpose locks. Nonuniform communication architectures (NUCAs), for example CCNUMAs built from a f...