Sciweavers

3686 search results - page 545 / 738
» Pattern-Based Parallel Programming
IEEEPACT
2008
IEEE
15 years 9 months ago
Exploiting loop-dependent stream reuse for stream processors
The memory access limits the performance of stream processors. By exploiting the reuse of data held in the Stream Register File (SRF), an on-chip storage, the number of memory acc...
Xuejun Yang, Ying Zhang, Jingling Xue, Ian Rogers,...
IPPS
2008
IEEE
15 years 9 months ago
Lattice Boltzmann simulation optimization on leading multicore platforms
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of search-based performance optimizatio...
Samuel Williams, Jonathan Carter, Leonid Oliker, J...
SBACPAD
2008
IEEE
15 years 9 months ago
Processing Neocognitron of Face Recognition on High Performance Environment Based on GPU with CUDA Architecture
This work presents an implementation of Neocognitron Neural Network, using a high performance computing architecture based on GPU (Graphics Processing Unit). Neocognitron is an ar...
Gustavo Poli, José Hiroki Saito, Joã...
HPDC
2007
IEEE
15 years 9 months ago
Feedback-directed thread scheduling with memory considerations
This paper describes a novel approach to generate an optimized schedule to run threads on distributed shared memory (DSM) systems. The approach relies upon a binary instrumentatio...
Fengguang Song, Shirley Moore, Jack Dongarra
ICDCS
2007
IEEE
15 years 9 months ago
A Virtual Node-Based Tracking Algorithm for Mobile Networks
We introduce a virtual-node based mobile object tracking algorithm for mobile sensor networks, VINESTALK. The algorithm uses the Virtual Stationary Automata programming layer, ...
Tina Nolte, Nancy A. Lynch