Sciweavers

2784 search results - page 229 / 557
» Instruction Level Parallelism
Sort
View
306
Voted
ICFP
2012
ACM
13 years 6 months ago
Nested data-parallelism on the gpu
Graphics processing units (GPUs) provide both memory bandwidth and arithmetic performance far greater than that available on CPUs but, because of their Single-Instruction-Multiple...
Lars Bergstrom, John H. Reppy
SPAA
2010
ACM
15 years 9 months ago
Implementing and evaluating nested parallel transactions in software transactional memory
Transactional Memory (TM) is a promising technique that simplifies parallel programming for shared-memory applications. To date, most TM systems have been designed to efficientl...
Woongki Baek, Nathan Grasso Bronson, Christos Kozy...
162
Voted
JSA
2010
158views more  JSA 2010»
14 years 11 months ago
Scalable mpNoC for massively parallel systems - Design and implementation on FPGA
The high chip-level integration enables the implementation of large-scale parallel processing architectures with 64 and more processing nodes on a single chip or on an FPGA device...
Mouna Baklouti, Yassine Aydi, Philippe Marquet, Je...
147
Voted
IPPS
1997
IEEE
15 years 8 months ago
A Tool for On-line Visualization and Interactive Steering of Parallel HPC Applications
Tools for parallel systems today range from specification over debugging to performance analysis and more. Typically, they help the programmers of parallel algorithms from the ea...
Sabine Rathmayer
ICFP
2008
ACM
16 years 4 months ago
A scheduling framework for general-purpose parallel languages
The trend in microprocessor design toward multicore and manycore processors means that future performance gains in software will largely come from harnessing parallelism. To reali...
Matthew Fluet, Mike Rainey, John H. Reppy