Sciweavers

119 search results - page 16 / 24
» Program Optimization in the Domain of High-Performance Paral...
Sort
View
SC
2000
ACM
15 years 1 months ago
Extending OpenMP for NUMA Machines
This paper describes extensions to OpenMP that implement data placement features needed for NUMA architectures. OpenMP is a collection of compiler directives and library routines ...
John Bircsak, Peter Craig, RaeLyn Crowell, Zarka C...
ICPP
2008
IEEE
15 years 4 months ago
Optimizing JPEG2000 Still Image Encoding on the Cell Broadband Engine
JPEG2000 is the latest still image coding standard from the JPEG committee, which adopts new algorithms such as Embedded Block Coding with Optimized Truncation (EBCOT) and Discret...
Seunghwa Kang, David A. Bader
IPPS
2005
IEEE
15 years 3 months ago
Automatic Support for Irregular Computations in a High-Level Language
The problem of writing high performance parallel applications becomes even more challenging when irregular, sparse or adaptive methods are employed. In this paper we introduce com...
Jimmy Su, Katherine A. Yelick
ASPLOS
2010
ACM
15 years 2 months ago
Modeling GPU-CPU workloads and systems
Heterogeneous systems, systems with multiple processors tailored for specialized tasks, are challenging programming environments. While it may be possible for domain experts to op...
Andrew Kerr, Gregory F. Diamos, Sudhakar Yalamanch...
CASES
2007
ACM
15 years 1 months ago
Application driven embedded system design: a face recognition case study
The key to increasing performance without a commensurate increase in power consumption in modern processors lies in increasing both parallelism and core specialization. Core speci...
Karthik Ramani, Al Davis