Scalability of applications on distributed sharedmemory (DSM) multiprocessors is limited by communication overheads. At some point, using more processors to increase parallelism y...
Khaled Z. Ibrahim, Gregory T. Byrd, Eric Rotenberg
We present an AC1 (logDCFL) algorithm for checking LTL formulas over finite paths, thus establishing that the problem can be efficiently parallelized. Our construction provides a f...
The prediction of the correct secondary structures of large RNAs is one of the unsolved challenges of computational molecular biology. Among the major obstacles is the fact that a...
Amrita Mathuriya, David A. Bader, Christine E. Hei...
Iterative stencil loops (ISLs) are used in many applications and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture...
In this work, we propose a new FPGA design flow that combines the CUDA programming model from Nvidia with the state of the art high-level synthesis tool AutoPilot from AutoESL, to...