Sciweavers

628 search results - page 112 / 126
» Tying Memory Management to Parallel Programming Models
IPPS
2010
IEEE
14 years 8 months ago
Inter-block GPU communication via fast barrier synchronization
The graphics processing unit (GPU) has evolved from a fixed-function processor with programmable stages to a programmable processor with many fixed-function components that deliver...
Shucai Xiao, Wu-chun Feng
MICRO
2010
IEEE
14 years 8 months ago
Explicit Communication and Synchronization in SARC
SARC merges cache controller and network interface functions by relying on a single hardware primitive: each access checks the tag and the state of the addressed line for possible...
Manolis Katevenis, Vassilis Papaefstathiou, Stamat...
MEMOCODE
2007
IEEE
15 years 4 months ago
Scheduling as Rule Composition
Bluespec is a high-level hardware description language used for architectural exploration, hardware modeling and synthesis of semiconductor chips. In Bluespec, one views hardware ...
Nirav Dave, Arvind, Michael Pellauer
PLDI
1996
ACM
15 years 2 months ago
A Reduced Multipipeline Machine Description that Preserves Scheduling Constraints
High performance compilers increasingly rely on accurate modeling of the machine resources to efficiently exploit the instruction level parallelism of an application. In this pape...
Alexandre E. Eichenberger, Edward S. Davidson
JPDC
2008
14 years 10 months ago
A performance study of general-purpose applications on graphics processors using CUDA
Graphics processors (GPUs) provide a vast number of simple, data-parallel, deeply multithreaded cores and high memory bandwidths. GPU architectures are becoming increasingly progr...
Shuai Che, Michael Boyer, Jiayuan Meng, David Tarj...