Sciweavers

5562 search results - page 468 / 1113
» Implementing Parallel Cell-DEVS
Sort
View
IPPS
2010
IEEE
15 years 4 months ago
Optimization of linked list prefix computations on multithreaded GPUs using CUDA
We present a number of optimization techniques to compute prefix sums on linked lists and implement them on multithreaded GPUs using CUDA. Prefix computations on linked structures ...
Zheng Wei, Joseph JáJá
IPPS
2009
IEEE
16 years 1 months ago
High-order stencil computations on multicore clusters
Stencil computation (SC) is of critical importance for broad scientific and engineering applications. However, it is a challenge to optimize complex, highorder SC on emerging clus...
Liu Peng, Richard Seymour, Ken-ichi Nomura, Rajiv ...
SPAA
1997
ACM
15 years 10 months ago
Pipelining with Futures
Pipelining has been used in the design of many PRAM algorithms to reduce their asymptotic running time. Paul, Vishkin, and Wagener (PVW) used the approach in a parallel implementat...
Guy E. Blelloch, Margaret Reid-Miller
ISPDC
2010
IEEE
15 years 4 months ago
Resource-Aware Compiler Prefetching for Many-Cores
—Super-scalar, out-of-order processors that can have tens of read and write requests in the execution window place significant demands on Memory Level Parallelism (MLP). Multi- ...
George C. Caragea, Alexandros Tzannes, Fuat Keceli...
TSE
1998
176views more  TSE 1998»
15 years 6 months ago
Constructive Protocol Specification Using Cicero
—New protocols are often useful, but are hard to implement well. Protocol synthesis is a solution, but synthesized protocols can be slow. Implementing protocols will be even more...
Yen-Min Huang, Chinya V. Ravishankar