Sciweavers

1557 search results - page 183 / 312
» Programming up to Congruence
HPCA
2008
IEEE
15 years 11 months ago
Address-branch correlation: A novel locality for long-latency hard-to-predict branches
Hard-to-predict branches depending on long-latency cache misses have been recognized as a major performance obstacle for modern microprocessors. With the widening speed gap between...
Hongliang Gao, Yi Ma, Martin Dimitrov, Huiyang Zho...
ASPLOS
2010
ACM
15 years 6 months ago
Conservation cores: reducing the energy of mature computations
Growing transistor counts, limited power budgets, and the breakdown of voltage scaling are currently conspiring to create a utilization wall that limits the fraction of a chip tha...
Ganesh Venkatesh, Jack Sampson, Nathan Goulding, S...
IEEEPACT
2009
IEEE
15 years 6 months ago
SOS: A Software-Oriented Distributed Shared Cache Management Approach for Chip Multiprocessors
This paper proposes a new software-oriented approach for managing the distributed shared L2 caches of a chip multiprocessor (CMP) for latency-oriented multithreaded appl...
Lei Jin, Sangyeun Cho
IPPS
2005
IEEE
15 years 4 months ago
Fast Address Translation Techniques for Distributed Shared Memory Compilers
The Distributed Shared Memory (DSM) model is designed to leverage the ease of programming of the shared memory paradigm, while enabling high performance by expressing locality ...
François Cantonnet, Tarek A. El-Ghazawi, Pa...
ISLPED
2005
ACM
15 years 4 months ago
Region-level approximate computation reuse for power reduction in multimedia applications
Motivated by data value locality and quality tolerance present in multimedia applications, we propose a new micro-architecture, Region-level Approximate Computation Buffer...
Xueqi Cheng, Michael S. Hsiao