Sciweavers

» Data Parallel Program Design
MICRO
1998
IEEE
15 years 9 months ago
A Bandwidth-efficient Architecture for Media Processing
Media applications are characterized by large amounts of available parallelism, little data reuse, and a high computation to memory access ratio. While these characteristics are p...
Scott Rixner, William J. Dally, Ujval J. Kapasi, B...
SIGSOFT
2003
ACM
15 years 10 months ago
Refinements and multi-dimensional separation of concerns
Step-wise refinement (SWR) asserts that complex programs can be derived from simple programs by progressively adding features. The length of a program specification is the number...
Don S. Batory, Jia Liu, Jacob Neal Sarvela
CLUSTER
2009
IEEE
15 years 11 months ago
Message passing for GPGPU clusters: CudaMPI
We present and analyze two new communication libraries, cudaMPI and glMPI, that provide an MPI-like message passing interface to communicate data stored on the graphics cards of...
Orion S. Lawlor
ICCD
2008
IEEE
16 years 1 month ago
Global bus route optimization with application to microarchitectural design exploration
Circuit and processor designs will continue to increase in complexity for the foreseeable future. With these increasing sizes comes the use of wide buses to move large amounts ...
Dae Hyun Kim, Sung Kyu Lim
JSS
2007
15 years 4 months ago
The design and evaluation of path matching schemes on compressed control flow traces
A control flow trace captures the complete sequence of dynamically executed basic blocks and function calls. It is usually of very large size and therefore commonly stored in com...
Yongjing Lin, Youtao Zhang, Rajiv Gupta