Sciweavers

14 search results - page 3 / 3
» Tradeoff between data-, instruction-, and thread-level paral...
ASPLOS
2010
ACM
14 years 5 days ago
MacroSS: macro-SIMDization of streaming applications
SIMD (Single Instruction, Multiple Data) engines are an essential part of the processors in various computing markets, from servers to the embedded domain. Although SIMD-enabled a...
Amir Hormati, Yoonseo Choi, Mark Woh, Manjunath Ku...
JSA
2000
13 years 5 months ago
Complete worst-case execution time analysis of straight-line hard real-time programs
In this article, the problem of finding a tight estimate on the worst-case execution time (WCET) of a real-time program is addressed. The analysis is focused on straight-line code...
Friedhelm Stappert, Peter Altenbernd
EUROPAR
2010
Springer
13 years 5 months ago
Optimized On-Chip-Pipelined Mergesort on the Cell/B.E.
Limited bandwidth to off-chip main memory is a performance bottleneck in chip multiprocessors for streaming computations, such as Cell/B.E., and this will become even mor...
Rikard Hultén, Christoph W. Kessler, Jö...
HPCA
2009
IEEE
14 years 5 months ago
Design and implementation of software-managed caches for multicores with local memory
Heterogeneous multicores, such as Cell BE processors and GPGPUs, typically do not have caches for their accelerator cores because coherence traffic, cache misses, and latencies fr...
Sangmin Seo, Jaejin Lee, Zehra Sura