Sciweavers

770 search results - page 114 / 154
» Parallel simulation of chip-multiprocessor architectures
ISSS
1998
IEEE
15 years 6 months ago
Data-Path Synthesis of VLIW Video Signal Processors
This paper describes a methodology for synthesizing the data-path of a Very Long Instruction Word (VLIW) based Video Signal Processor (VSP). Offering both performance and programm...
Zhao Wu, Wayne Wolf
ISCA
2011
IEEE
14 years 5 months ago
TLSync: support for multiple fast barriers using on-chip transmission lines
As the number of cores on a single chip grows, scalable barrier synchronization becomes increasingly difficult to implement. In software implementations, such as the tournament ba...
Jungju Oh, Milos Prvulovic, Alenka G. Zajic
JSSPP
2004
Springer
15 years 7 months ago
Reconfigurable Gang Scheduling Algorithm
Using a single traditional gang scheduling algorithm cannot provide the best performance for all workloads and parallel architectures. A solution to this problem is the use of...
Luís Fabrício Wanderley Góes,...
IPPS
2000
IEEE
15 years 6 months ago
Reducing Ownership Overhead for Load-Store Sequences in Cache-Coherent Multiprocessors
Parallel programs that modify shared data in a cache-coherent multiprocessor with a write-invalidate coherence protocol create ownership overhead in the form of ownership acquisiti...
Jim Nilsson, Fredrik Dahlgren
PODC
1994
ACM
15 years 6 months ago
A Performance Evaluation of Lock-Free Synchronization Protocols
In this paper, we investigate the practical performance of lock-free techniques that provide synchronization on shared-memory multiprocessors. Our goal is to provide a technique t...
Anthony LaMarca