Sciweavers

HPCA
2012
IEEE
11 years 11 months ago
Power balanced pipelines
Since the onset of pipelined processors, balancing the delay of the microarchitectural pipeline stages such that each microarchitectural pipeline stage has an equal delay has been...
John Sartori, Ben Ahrens, Rakesh Kumar
HPCA
2012
IEEE
11 years 11 months ago
System-level implications of disaggregated memory
Recent research on memory disaggregation introduces a new architectural building block—the memory blade—as a cost-effective approach for memory capacity expansion and sharing ...
Kevin T. Lim, Yoshio Turner, Jose Renato Santos, A...
HPCA
2012
IEEE
11 years 11 months ago
Decoupled dynamic cache segmentation
The least recently used (LRU) replacement policy performs poorly in the last-level cache (LLC) because temporal locality of memory accesses is filtered by first and second level...
Samira Manabi Khan, Zhe Wang, Daniel A. Jimé...
HPCA
2012
IEEE
11 years 11 months ago
Pacman: Tolerating asymmetric data races with unintrusive hardware
Data races are a major contributor to parallel software unreliability. A type of race that is both common and typically harmful is the Asymmetric data race. It occurs when at leas...
Shanxiang Qi, Norimasa Otsuki, Lois Orosa Nogueira...
HPCA
2012
IEEE
11 years 11 months ago
SCD: A scalable coherence directory with flexible sharer set encoding
Large-scale CMPs with hundreds of cores require a directory-based protocol to maintain cache coherence. However, previously proposed coherence directories are hard to scale beyond...
Daniel Sanchez, Christos Kozyrakis
HPCA
2012
IEEE
11 years 11 months ago
Staged Reads: Mitigating the impact of DRAM writes on DRAM reads
Main memory latencies have always been a concern for system performance. Given that reads are on the critical path for CPU progress, reads must be prioritized over writes. However...
Niladrish Chatterjee, Naveen Muralimanohar, Rajeev...
HPCA
2012
IEEE
11 years 11 months ago
Flexible register management using reference counting
Conventional out-of-order processors that use a unified physical register file allocate and reclaim registers explicitly using a free list that operates as a circular queue. We ...
Steven Battle, Andrew D. Hilton, Mark Hempstead, A...
HPCA
2012
IEEE
11 years 11 months ago
BulkSMT: Designing SMT processors for atomic-block execution
Multiprocessor architectures that continuously execute atomic blocks (or chunks) of instructions can improve performance and software productivity. However, all of the prior propo...
Xuehai Qian, Benjamin Sahelices, Josep Torrellas
HPCA
2012
IEEE
11 years 11 months ago
Balancing DRAM locality and parallelism in shared memory CMP systems
Modern memory systems rely on spatial locality to provide high bandwidth while minimizing memory device power and cost. The trend of increasing the number of cores that share memo...
Min Kyu Jeong, Doe Hyun Yoon, Dam Sunwoo, Mike Sul...
HPCA
2012
IEEE
11 years 11 months ago
Booster: Reactive core acceleration for mitigating the effects of process variation and application imbalance in low-voltage chi
Lowering supply voltage is one of the most effective techniques for reducing microprocessor power consumption. Unfortunately, at low voltages, chips are very sensitive to process ...
Timothy N. Miller, Xiang Pan, Renji Thomas, Naser ...