Sciweavers

3321 search results - page 439 / 665
» Performance of parallel computations with dynamic processor ...
Sort
View
MICRO
1998
IEEE
129views Hardware» more  MICRO 1998»
15 years 8 months ago
A Bandwidth-efficient Architecture for Media Processing
Media applications are characterized by large amounts of available parallelism, little data reuse, and a high computation to memory access ratio. While these characteristics are p...
Scott Rixner, William J. Dally, Ujval J. Kapasi, B...
ICPPW
2009
IEEE
15 years 11 months ago
Hardware Microkernels for Heterogeneous Manycore Systems
Abstract— The migration away from power-hungry, speculative execution procesors towards manycore architectures is good news for the embedded and real-time systems community. Comm...
Jason Agron, David L. Andrews
HPDC
2007
IEEE
15 years 10 months ago
Feedback-directed thread scheduling with memory considerations
This paper describes a novel approach to generate an optimized schedule to run threads on distributed shared memory (DSM) systems. The approach relies upon a binary instrumentatio...
Fengguang Song, Shirley Moore, Jack Dongarra
IPPS
2006
IEEE
15 years 10 months ago
Empowering a helper cluster through data-width aware instruction selection policies
Narrow values that can be represented by less number of bits than the full machine width occur very frequently in programs. On the other hand, clustering mechanisms enable cost- a...
Osman S. Unsal, Oguz Ergin, Xavier Vera, Antonio G...
VECPAR
2004
Springer
15 years 9 months ago
Automatically Tuned FFTs for BlueGene/L's Double FPU
Abstract. IBM is currently developing the new line of BlueGene/L supercomputers. The top-of-the-line installation is planned to be a 65,536 processors system featuring a peak perfo...
Franz Franchetti, Stefan Kral, Juergen Lorenz, Mar...