Sciweavers

66 search results - page 2 / 14
» Thread Cluster Memory Scheduling: Exploiting Differences in ...
Sort
View
IPPS
2009
IEEE
13 years 11 months ago
Exploiting DMA to enable non-blocking execution in Decoupled Threaded Architecture
DTA (Decoupled Threaded Architecture) is designed to exploit fine/medium grained Thread Level Parallelism (TLP) by using a distributed hardware scheduling unit and relying on exi...
Roberto Giorgi, Zdravko Popovic, Nikola Puzovic
HPCA
2007
IEEE
14 years 5 months ago
A Burst Scheduling Access Reordering Mechanism
Utilizing the nonuniform latencies of SDRAM devices, access reordering mechanisms alter the sequence of main memory access streams to reduce the observed access latency. Using a r...
Jun Shao, Brian T. Davis
ISCA
2000
IEEE
121views Hardware» more  ISCA 2000»
13 years 9 months ago
Memory access scheduling
The bandwidth and latency of a memory system are strongly dependent on the manner in which accesses interact with the “3-D” structure of banks, rows, and columns characteristi...
Scott Rixner, William J. Dally, Ujval J. Kapasi, P...
SAC
2008
ACM
13 years 4 months ago
Exploiting program cyclic behavior to reduce memory latency in embedded processors
In this work we modify the conventional row buffer allocation mechanism used in DDR2 SDRAM banks to improve average memory latency and overall processor performance. Our method as...
Ehsan Atoofian, Amirali Baniasadi
HOTOS
2009
IEEE
13 years 8 months ago
Reinventing Scheduling for Multicore Systems
High performance on multicore processors requires that schedulers be reinvented. Traditional schedulers focus on keeping execution units busy by assigning each core a thread to ru...
Silas Boyd-Wickizer, Robert Morris, M. Frans Kaash...