Sciweavers

659 search results - page 129 / 132
» Application Specific Processors for Multimedia Applications
EUROPAR
2010
Springer
15 years 24 days ago
Optimized Dense Matrix Multiplication on a Many-Core Architecture
Abstract. Traditional parallel programming methodologies for improving performance assume cache-based parallel systems. However, new architectures, like the IBM Cyclops-64 (C64), b...
Elkin Garcia, Ioannis E. Venetis, Rishi Khan, Guan...
DAC
2010
ACM
14 years 12 months ago
A correlation-based design space exploration methodology for multi-processor systems-on-chip
Given the increasing complexity of multi-processor systems-on-chip, a wide range of parameters must be tuned to find the best trade-offs in terms of the selected system figures of ...
Giovanni Mariani, Aleksandar Brankovic, Gianluca P...
ICFP
2010
ACM
14 years 12 months ago
A certified framework for compiling and executing garbage-collected languages
We describe the design, implementation, and use of a machine-certified framework for correct compilation and execution of programs in garbage-collected languages. Our framework ext...
Andrew McCreight, Tim Chevalier, Andrew P. Tolmach
CLUSTER
2006
IEEE
14 years 11 months ago
Performance of parallel communication and spawning primitives on a Linux cluster
The Linux cluster considered in this paper, formed from shuttle box XPC nodes with 2 GHz Athlon processors connected by dual Gb Ethernet switches, is relatively easily constructed...
David J. Johnston, Martin Fleury, Michael Lincoln,...
TPDS
1998
14 years 11 months ago
An Efficient Algorithm for Row Minima Computations on Basic Reconfigurable Meshes
A matrix A of size m × n containing items from a totally ordered universe is termed monotone if, for every i, j, 1 ≤ i < j ≤ m, the minimum value in row j lies below or to...
Koji Nakano, Stephan Olariu