Sciweavers

234 search results - page 33 / 47
» Architecture Implementation Using the Machine Description La...
Sort
View
ACMMSP
2004
ACM
92views Hardware» more  ACMMSP 2004»
15 years 2 months ago
Instruction combining for coalescing memory accesses using global code motion
Instruction combining is an optimization to replace a sequence of instructions with a more efficient instruction yielding the same result in a fewer machine cycles. When we use it...
Motohiro Kawahito, Hideaki Komatsu, Toshio Nakatan...
ISCA
1998
IEEE
118views Hardware» more  ISCA 1998»
15 years 1 months ago
Active Messages: A Mechanism for Integrated Communication and Computation
The design challenge for large-scale multiprocessors is (1) to minimize communication overhead, (2) allow communication to overlap computation, and (3) coordinate the two without ...
Thorsten von Eicken, David E. Culler, Seth Copen G...
OOPSLA
2010
Springer
14 years 8 months ago
Hera-JVM: a runtime system for heterogeneous multi-core architectures
Heterogeneous multi-core processors, such as the IBM Cell processor, can deliver high performance. However, these processors are notoriously difficult to program: different cores...
Ross McIlroy, Joe Sventek
TPHOL
2009
IEEE
15 years 4 months ago
Practical Tactics for Separation Logic
Abstract. We present a comprehensive set of tactics that make it practical to use separation logic in a proof assistant. These tactics enable the verification of partial correctne...
Andrew McCreight
ASPLOS
1991
ACM
15 years 1 months ago
LimitLESS Directories: A Scalable Cache Coherence Scheme
Caches enhance the performance of multiprocessors by reducing network trac and average memory access latency. However, cache-based systems must address the problem of cache coher...
David Chaiken, John Kubiatowicz, Anant Agarwal