Sciweavers

71 search results - page 9 / 15
» Improving memory bank-level parallelism in the presence of p...
IPPS
2009
IEEE
15 years 6 months ago
Exploiting DMA to enable non-blocking execution in Decoupled Threaded Architecture
DTA (Decoupled Threaded Architecture) is designed to exploit fine/medium grained Thread Level Parallelism (TLP) by using a distributed hardware scheduling unit and relying on exi...
Roberto Giorgi, Zdravko Popovic, Nikola Puzovic
EUROPAR
2003
Springer
15 years 5 months ago
Compression in Data Caches with Compressible Field Isolation for Recursive Data Structures
We introduce a software/hardware scheme called the Field Array Compression Technique (FACT) which reduces cache misses due to recursive data structures. Using a data layout transfo...
Masamichi Takagi, Kei Hiraki
CORR
2009
Springer
14 years 9 months ago
Parallelizing Deadlock Resolution in Symbolic Synthesis of Distributed Programs
Previous work has shown that there are two major complexity barriers in the synthesis of fault-tolerant distributed programs, namely generation of fault-span, the set of states re...
Fuad Abujarad, Borzoo Bonakdarpour, Sandeep S. Kul...
IEEEPACT
2002
IEEE
15 years 4 months ago
Using the Compiler to Improve Cache Replacement Decisions
Memory performance is increasingly determining microprocessor performance and technology trends are exacerbating this problem. Most architectures use set-associative caches with L...
Zhenlin Wang, Kathryn S. McKinley, Arnold L. Rosen...
HPCA
2008
IEEE
16 years 2 days ago
Runahead Threads to improve SMT performance
In this paper, we propose Runahead Threads (RaT) as a valuable solution for both reducing resource contention and exploiting memory-level parallelism in Simultaneous Multithreaded...
Tanausú Ramírez, Alex Pajuelo, Olive...