Sciweavers

369 search results - page 44 / 74
» Instruction set mapping for performance optimization
Sort
View
HPCA
2000
IEEE
15 years 6 months ago
Improving the Throughput of Synchronization by Insertion of Delays
Efficiency of synchronization mechanisms can limit the parallel performance of many shared-memory applications. In addition, the ever increasing performance gap between processor...
Ravi Rajwar, Alain Kägi, James R. Goodman
JSAC
2006
120views more  JSAC 2006»
15 years 1 months ago
A Tutorial on Cross-Layer Optimization in Wireless Networks
This tutorial paper overviews recent developments in optimization-based approaches for resource allocation problems in wireless systems. We begin by overviewing important results i...
Xiaojun Lin, Ness B. Shroff, R. Srikant
ASPDAC
2006
ACM
140views Hardware» more  ASPDAC 2006»
15 years 8 months ago
A 52mW 1200MIPS compact DSP for multi-core media SoC
- This paper presents a DSP core for multi-core media SoC, which is optimized to execute a set of signal processing tasks very efficiently. The fully-programmable core has a data-c...
Shih-Hao Ou, Tay-Jyi Lin, Chao-Wei Huang, Yu-Ting ...
EUROPAR
2010
Springer
15 years 2 months ago
Optimized On-Chip-Pipelined Mergesort on the Cell/B.E
Abstract. Limited bandwidth to off-chip main memory is a performance bottleneck in chip multiprocessors for streaming computations, such as Cell/B.E., and this will become even mor...
Rikard Hultén, Christoph W. Kessler, Jö...
EUROPAR
2010
Springer
15 years 3 months ago
Optimized Dense Matrix Multiplication on a Many-Core Architecture
Abstract. Traditional parallel programming methodologies for improving performance assume cache-based parallel systems. However, new architectures, like the IBM Cyclops-64 (C64), b...
Elkin Garcia, Ioannis E. Venetis, Rishi Khan, Guan...