As multi/many-core processors become prevalent, the programming language is important in constructing efficient parallel applications. In this work, we build a multithreaded video min...
Wenlong Li, Eric Li, Ran Meng, Tao Wang, Carole Du...
Exploiting parallelism at both the multiprocessor level and the instruction level is an effective means for supercomputers to achieve high performance. The amount of instruction-le...
Scott A. Mahlke, William Y. Chen, John C. Gyllenha...
In this paper, we present three parallel flexible approximate string matching methods on a parallel architecture with heterogeneous workstations to gain supercomputer power at lo...
Panagiotis D. Michailidis, Konstantinos G. Margari...
We present a new fast and scalable matrix multiplication algorithm, called DIMMA (Distribution-Independent Matrix Multiplication Algorithm), for block cyclic data distribution on ...
Dynamic Thermal Management (DTM) techniques have been proposed to save on thermal packaging and cooling costs for general-purpose processors. However, when invoked, these technique...