Sciweavers

369 search results - page 14 / 74
» Instruction set mapping for performance optimization
CLUSTER
2011
IEEE
13 years 11 months ago
Performance Characterization and Optimization of Atomic Operations on AMD GPUs
Atomic operations are important building blocks in supporting general-purpose computing on graphics processing units (GPUs). For instance, they can be used to coordinate executi...
Marwa Elteir, Heshan Lin, Wu-chun Feng
CGO
2003
IEEE
15 years 5 months ago
Dynamic Binary Translation for Accumulator-Oriented Architectures
A dynamic binary translation system for a co-designed virtual machine is described and evaluated. The underlying hardware directly executes an accumulator-oriented instruction set...
Ho-Seop Kim, James E. Smith
ICCD
2005
IEEE
15 years 8 months ago
H-SIMD Machine: Configurable Parallel Computing for Matrix Multiplication
FPGAs (Field-Programmable Gate Arrays) are often used as coprocessors to boost the performance of data-intensive applications [1, 2]. However, mapping algorithms onto multimillion-...
Xizhen Xu, Sotirios G. Ziavras
IPPS
2010
IEEE
14 years 9 months ago
Performance and energy optimization of concurrent pipelined applications
In this paper, we study the problem of finding optimal mappings for several independent but concurrent workflow applications, in order to optimize performance-related criteria tog...
Anne Benoit, Paul Renaud-Goud, Yves Robert
ISCAPDCS
2007
15 years 1 months ago
Evaluation of architectural support for speech codecs application in large-scale parallel machines
Next generation multimedia mobile phones that use the high bandwidth 3G cellular radio network consume more power. Multimedia algorithms such as speech, video transcodecs have ...
Naeem Zafar Azeemi