Sciweavers

52 search results - page 3 / 11
» Strategies and Implementation for Translating OpenMP Code fo...
HPCA
2009
IEEE
14 years 5 months ago
Design and implementation of software-managed caches for multicores with local memory
Heterogeneous multicores, such as Cell BE processors and GPGPUs, typically do not have caches for their accelerator cores because coherence traffic, cache misses, and latencies fr...
Sangmin Seo, Jaejin Lee, Zehra Sura
AINA
2006
IEEE
13 years 11 months ago
On Optimization and Parallelization of Fuzzy Connected Segmentation for Medical Imaging
Fuzzy Connectedness is an important image segmentation routine for image processing of medical images. It is often used in preparation for surgery and sometimes during surgery. It...
Christopher Gammage, Vipin Chaudhary
JPDC
2006
13 years 4 months ago
Performance characteristics of the multi-zone NAS parallel benchmarks
We describe a new suite of computational benchmarks that models applications featuring multiple levels of parallelism. Such parallelism is often available in realistic flow comput...
Haoqiang Jin, Rob F. Van der Wijngaart
PAAPP
2002
13 years 4 months ago
Performance of PDE solvers on a self-optimizing NUMA architecture
Abstract. The performance of shared-memory (OpenMP) implementations of three different PDE solver kernels representing finite difference methods, finite volume methods, and spectra...
Sverker Holmgren, Markus Nordén, Jarmo Rant...
ICFEM
2009
Springer
13 years 2 months ago
Implementing a Direct Method for Certificate Translation
Abstract. Certificate translation is a method that transforms certificates of source programs into certificates of their compilation. It provides strong guarantees on low-level cod...
Gilles Barthe, Benjamin Grégoire, Sylvain H...