Sciweavers

2784 search results - page 237 / 557
» Instruction Level Parallelism
IPPS
2008
IEEE
15 years 10 months ago
DC-SIMD: Dynamic communication for SIMD processors
SIMD (single instruction multiple data)-type processors have been found very efficient in image processing applications, because their repetitive structure is able to exploit the...
Raymond Frijns, Hamed Fatemi, Bart Mesman, Henk Co...
MICRO
2003
IEEE
15 years 9 months ago
Using Interaction Costs for Microarchitectural Bottleneck Analysis
Attacking bottlenecks in modern processors is difficult because many microarchitectural events overlap with each other. This parallelism makes it difficult to both (a) assign a ...
Brian A. Fields, Rastislav Bodík, Mark D. H...
IEEEPACT
2002
IEEE
15 years 9 months ago
Compiler-Controlled Caching in Superword Register Files for Multimedia Extension Architectures
In this paper, we describe an algorithm and implementation of locality optimizations for architectures with instruction sets such as Intel’s SSE and Motorola’s AltiVec that su...
Jaewook Shin, Jacqueline Chame, Mary W. Hall
HPCA
2000
IEEE
15 years 8 months ago
Improving the Throughput of Synchronization by Insertion of Delays
Efficiency of synchronization mechanisms can limit the parallel performance of many shared-memory applications. In addition, the ever increasing performance gap between processor...
Ravi Rajwar, Alain Kägi, James R. Goodman
HPCA
1998
IEEE
15 years 8 months ago
Virtual-Physical Registers
A novel dynamic register renaming approach is proposed in this work. The key idea of the novel scheme is to delay the allocation of physical registers until a late stage in the pi...
Antonio González, José Gonzál...