Sciweavers

4198 search results - page 531 / 840
» Data Parallel Program Design
Sort
View
IPPS
1998
IEEE
15 years 8 months ago
Code Transformations for Low Power Caching in Embedded Multimedia Processors
In this paper, we present several novel strategies to improve software controlled cache utilization, so as to achieve lower power requirements for multi-media and signal processin...
Chidamber Kulkarni, Francky Catthoor, Hugo De Man
IEEEPACT
2008
IEEE
15 years 10 months ago
Exploiting loop-dependent stream reuse for stream processors
The memory access limits the performance of stream processors. By exploiting the reuse of data held in the Stream Register File (SRF), an on-chip storage, the number of memory acc...
Xuejun Yang, Ying Zhang, Jingling Xue, Ian Rogers,...
SBACPAD
2008
IEEE
249views Hardware» more  SBACPAD 2008»
15 years 10 months ago
Processing Neocognitron of Face Recognition on High Performance Environment Based on GPU with CUDA Architecture
This work presents an implementation of Neocognitron Neural Network, using a high performance computing architecture based on GPU (Graphics Processing Unit). Neocognitron is an ar...
Gustavo Poli, José Hiroki Saito, Joã...
HPDC
2007
IEEE
15 years 10 months ago
Feedback-directed thread scheduling with memory considerations
This paper describes a novel approach to generate an optimized schedule to run threads on distributed shared memory (DSM) systems. The approach relies upon a binary instrumentatio...
Fengguang Song, Shirley Moore, Jack Dongarra
IPPS
2006
IEEE
15 years 10 months ago
Speeding up NGB with distributed file streaming framework
Grid computing provides a very rich environment for scientific calculations. In addition to the challenges it provides, it also offers new opportunities for optimization. In this ...
Bingchen Li, Kang Chen, Zhiteng Huang, H. L. Rajic...