Sciweavers

338 search results - page 66 / 68
» Automated Performance Prediction of Message-Passing Parallel...
CODES
2005
IEEE
15 years 5 months ago
High-level synthesis for large bit-width multipliers on FPGAs: a case study
In this paper, we present the analysis, design and implementation of an estimator to realize large bit width unsigned integer multiplier units. Larger multiplier units are require...
Gang Quan, James P. Davis, Siddhaveerasharan Devar...
ASPLOS
2009
ACM
16 years 10 days ago
StreamRay: a stream filtering architecture for coherent ray tracing
The wide availability of commodity graphics processors has made real-time graphics an intrinsic component of the human/computer interface. These graphics cores accelerate the z-bu...
Karthik Ramani, Christiaan P. Gribble, Al Davis
OOPSLA
2005
Springer
15 years 5 months ago
X10: an object-oriented approach to non-uniform cluster computing
It is now well established that the device scaling predicted by Moore’s Law is no longer a viable option for increasing the clock frequency of future uniprocessor systems at the...
Philippe Charles, Christian Grothoff, Vijay A. Sar...
IEEEPACT
2000
IEEE
15 years 4 months ago
A Lightweight Algorithm for Dynamic If-Conversion during Dynamic Optimization
Dynamic Optimization is an umbrella term that refers to any optimization of software that is performed after the initial compile time. It is a complementary optimization opportuni...
Kim M. Hazelwood, Thomas M. Conte
ICS
1999
Tsinghua U.
15 years 4 months ago
Software trace cache
This paper explores the use of compiler optimizations which optimize the layout of instructions in memory. The target is to enable the code to make better use of the underlying ...
Alex Ramírez, Josep-Lluis Larriba-Pey, Carl...