Sciweavers

45 search results - page 2 / 9
» Cache Performance Optimizations for Parallel Lattice Boltzma...
Sort
View
ISCAPDCS
2007
13 years 7 months ago
Evaluation of architectural support for speech codecs application in large-scale parallel machines
— Next generation multimedia mobile phones that use the high bandwidth 3G cellular radio network consume more power. Multimedia algorithms such as speech, video transcodecs have ...
Naeem Zafar Azeemi
ICCAD
2005
IEEE
131views Hardware» more  ICCAD 2005»
14 years 2 months ago
Code restructuring for improving cache performance of MPSoCs
— One of the critical goals in code optimization for MPSoC architectures is to minimize the number of off-chip memory accesses. This is because such accesses can be extremely cos...
Guilin Chen, Mahmut T. Kandemir
PLDI
1995
ACM
13 years 9 months ago
Improving Balanced Scheduling with Compiler Optimizations that Increase Instruction-Level Parallelism
Traditional list schedulers order instructions based on an optimistic estimate of the load latency imposed by the hardware and therefore cannot respond to variations in memory lat...
Jack L. Lo, Susan J. Eggers
ICS
2007
Tsinghua U.
13 years 11 months ago
Performance driven data cache prefetching in a dynamic software optimization system
Software or hardware data cache prefetching is an efficient way to hide cache miss latency. However effectiveness of the issued prefetches have to be monitored in order to maximi...
Jean Christophe Beyler, Philippe Clauss
CONCURRENCY
2006
140views more  CONCURRENCY 2006»
13 years 5 months ago
An efficient memory operations optimization technique for vector loops on Itanium 2 processors
To keep up with a large degree of instruction level parallelism (ILP), the Itanium 2 cache systems use a complex organization scheme: load/store queues, banking and interleaving. ...
William Jalby, Christophe Lemuet, Sid Ahmed Ali To...