Sciweavers

» Pattern-Based Parallel Programming
IEEEPACT
2008
IEEE
15 years 11 months ago
Exploiting loop-dependent stream reuse for stream processors
Memory access limits the performance of stream processors. By exploiting the reuse of data held in the Stream Register File (SRF), an on-chip store, the number of memory acc...
Xuejun Yang, Ying Zhang, Jingling Xue, Ian Rogers,...
IPPS
2008
IEEE
15 years 11 months ago
Lattice Boltzmann simulation optimization on leading multicore platforms
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of search-based performance optimizatio...
Samuel Williams, Jonathan Carter, Leonid Oliker, J...
SBACPAD
2008
IEEE
15 years 11 months ago
Processing Neocognitron of Face Recognition on High Performance Environment Based on GPU with CUDA Architecture
This work presents an implementation of the Neocognitron neural network on a high-performance computing architecture based on a GPU (Graphics Processing Unit). Neocognitron is an ar...
Gustavo Poli, José Hiroki Saito, Joã...
HPDC
2007
IEEE
15 years 11 months ago
Feedback-directed thread scheduling with memory considerations
This paper describes a novel approach to generate an optimized schedule to run threads on distributed shared memory (DSM) systems. The approach relies upon a binary instrumentatio...
Fengguang Song, Shirley Moore, Jack Dongarra
ICDCS
2007
IEEE
15 years 11 months ago
A Virtual Node-Based Tracking Algorithm for Mobile Networks
We introduce a virtual-node based mobile object tracking algorithm for mobile sensor networks, VINESTALK. The algorithm uses the Virtual Stationary Automata programming layer, ...
Tina Nolte, Nancy A. Lynch