Sciweavers

4198 search results - page 729 / 840
» Data Parallel Program Design
Sort
View
111
Voted
DAC
2002
ACM
16 years 3 months ago
Dynamic hardware plugins in an FPGA with partial run-time reconfiguration
Tools and a design methodology have been developed to support partial run-time reconfiguration of FPGA logic on the Field Programmable Port Extender. High-speed Internet packet pr...
Edson L. Horta, John W. Lockwood, David E. Taylor,...
HPCA
2005
IEEE
16 years 2 months ago
Using Virtual Load/Store Queues (VLSQs) to Reduce the Negative Effects of Reordered Memory Instructions
The use of large instruction windows coupled with aggressive out-oforder and prefetching capabilities has provided significant improvements in processor performance. In this paper...
Aamer Jaleel, Bruce L. Jacob
MICRO
2000
IEEE
176views Hardware» more  MICRO 2000»
15 years 2 months ago
An Advanced Optimizer for the IA-64 Architecture
level of abstraction, compared with the program representation for scalar optimizations. For example, loop unrolling and loop unrolland-jam transformations exploit the large regist...
Rakesh Krishnaiyer, Dattatraya Kulkarni, Daniel M....
ISLPED
2005
ACM
96views Hardware» more  ISLPED 2005»
15 years 8 months ago
Region-level approximate computation reuse for power reduction in multimedia applications
ABSTRACT Motivated by data value locality and quality tolerance present in multimedia applications, we propose a new micro-architecture, Region-level Approximate Computation Buffer...
Xueqi Cheng, Michael S. Hsiao
CASES
2003
ACM
15 years 7 months ago
Exploiting bank locality in multi-bank memories
Bank locality can be defined as localizing the number of load/store accesses to a small set of memory banks at a given time. An optimizing compiler can modify a given input code t...
Guilin Chen, Mahmut T. Kandemir, Hendra Saputra, M...