Sciweavers

2784 search results - page 387 / 557
» Instruction Level Parallelism
Sort
View
120
Voted
HPCA
2008
IEEE
16 years 3 months ago
Fundamental performance constraints in horizontal fusion of in-order cores
A conceptually appealing approach to supporting a broad range of workloads is a system comprising many small cores that can be fused, on demand, into larger cores. We demonstrate ...
Pierre Salverda, Craig B. Zilles
HPCA
2007
IEEE
16 years 3 months ago
Thermal Herding: Microarchitecture Techniques for Controlling Hotspots in High-Performance 3D-Integrated Processors
3D integration technology greatly increases transistor density while providing faster on-chip communication. 3D implementations of processors can simultaneously provide both laten...
Kiran Puttaswamy, Gabriel H. Loh
HPCA
2004
IEEE
16 years 3 months ago
Accurate and Complexity-Effective Spatial Pattern Prediction
Recent research suggests that there are large variations in a cache's spatial usage, both within and across programs. Unfortunately, conventional caches typically employ fixe...
Chi F. Chen, Se-Hyun Yang, Babak Falsafi, Andreas ...
123
Voted
ASPLOS
2010
ACM
15 years 10 months ago
Flexible architectural support for fine-grain scheduling
To make efficient use of CMPs with tens to hundreds of cores, it is often necessary to exploit fine-grain parallelism. However, managing tasks of a few thousand instructions is ...
Daniel Sanchez, Richard M. Yoo, Christos Kozyrakis
ASPLOS
2010
ACM
15 years 10 months ago
MacroSS: macro-SIMDization of streaming applications
SIMD (Single Instruction, Multiple Data) engines are an essential part of the processors in various computing markets, from servers to the embedded domain. Although SIMD-enabled a...
Amir Hormati, Yoonseo Choi, Mark Woh, Manjunath Ku...