Sciweavers

» Optimizing the Instruction Cache Performance of the Operatin...
WDAG
2007
Springer
86 views, Algorithms
15 years 10 months ago
Cost-Aware Caching Algorithms for Distributed Storage Servers
We study replacement algorithms for non-uniform access caches that are used in distributed storage systems. Considering access latencies as major costs of data management in such a...
Shuang Liang, Ke Chen, Song Jiang, Xiaodong Zhang
CC
2008
Springer
144 views, System Software
15 years 6 months ago
Control Flow Emulation on Tiled SIMD Architectures
Heterogeneous multi-core and streaming architectures such as the GPU, Cell, ClearSpeed, and Imagine processors have better power/performance ratios and memory bandwidth than tradi...
Ghulam Lashari, Ondrej Lhoták, Michael McCo...
DSD
2003
IEEE
69 views, Hardware
15 years 9 months ago
A VLIW Architecture for Logarithmic Arithmetic
The Logarithmic Number System (LNS) is an alternative to IEEE-754 standard floating-point arithmetic. LNS multiply, divide and square root are easier than IEEE-754 and naturally ...
Mark G. Arnold
ICS
2007
Tsinghua U.
15 years 10 months ago
Optimization of data prefetch helper threads with path-expression based statistical modeling
This paper investigates helper threads that improve performance by prefetching data on behalf of an application’s main thread. The focus is data prefetch helper threads that lac...
Tor M. Aamodt, Paul Chow
DAC
2004
ACM
16 years 5 months ago
Multi-profile based code compression
Code compression has been shown to be an effective technique to reduce code size in memory constrained embedded systems. It has also been used as a way to increase cache hit ratio...
Eduardo Wanderley Netto, Rodolfo Azevedo, Paulo Ce...