Sciweavers

622 search results - page 51 / 125
» Comparing the Optimal Performance of Multiprocessor Architec...
Sort
View
IPPS
2008
IEEE
15 years 4 months ago
Lattice Boltzmann simulation optimization on leading multicore platforms
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of searchbased performance optimizatio...
Samuel Williams, Jonathan Carter, Leonid Oliker, J...
ICCD
2005
IEEE
134views Hardware» more  ICCD 2005»
15 years 6 months ago
Architectural Considerations for Energy Efficiency
The formal analysis of parallelism and pipelining is performed on an 8-bit Add-Compare-Select element of a Viterbi decoder. The results are quantified through a study of the delay...
Hoang Q. Dao, Bart R. Zeydel, Vojin G. Oklobdzija
ICIP
2009
IEEE
15 years 10 months ago
Architecture Design Of A High-performance Dual-symbol Binary Arithmetic Coder For Jpeg2000
The embedded-block coding with optimized truncation (EBCOT), which consists of a bit-plane coder (BPC) and a binary arithmetic coder (BAC), is the bottleneck in realizing a high-p...
ISCA
2011
IEEE
238views Hardware» more  ISCA 2011»
14 years 1 months ago
Rebound: scalable checkpointing for coherent shared memory
As we move to large manycores, the hardware-based global checkpointing schemes that have been proposed for small shared-memory machines do not scale. Scalability barriers include ...
Rishi Agarwal, Pranav Garg, Josep Torrellas
EUROPAR
2010
Springer
14 years 10 months ago
Optimized On-Chip-Pipelined Mergesort on the Cell/B.E
Abstract. Limited bandwidth to off-chip main memory is a performance bottleneck in chip multiprocessors for streaming computations, such as Cell/B.E., and this will become even mor...
Rikard Hultén, Christoph W. Kessler, Jö...