The Message Passing Interface (MPI) has become the de facto standard for parallel applications in high-performance computing. Within a single cluster node, MPI implementations benefit from the s...
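The truncated sentence above presumably refers to shared-memory communication within a node; assuming so, the following is a minimal sketch of how that locality can be exposed at the API level: MPI-3's MPI_Comm_split_type with MPI_COMM_TYPE_SHARED yields a communicator containing only the ranks that share a node, over which implementations typically communicate through shared memory rather than the network. This is illustrative background, not the specific mechanism studied in the abstract.

/* Minimal sketch: obtain a communicator containing only the ranks that
 * share a node, where MPI implementations can typically communicate
 * through shared memory rather than the network (MPI-3 API). */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int world_rank, node_rank, node_size;
    MPI_Comm node_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Split COMM_WORLD into per-node communicators. */
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node_comm);
    MPI_Comm_rank(node_comm, &node_rank);
    MPI_Comm_size(node_comm, &node_size);

    printf("world rank %d is rank %d of %d on its node\n",
           world_rank, node_rank, node_size);

    MPI_Comm_free(&node_comm);
    MPI_Finalize();
    return 0;
}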
Control algorithms implemented directly in hardware take advantage of parallel signal processing. Furthermore, implementing controller functionality in reconfigurable hardware fac...
We study the problem of exploiting parallelism in search-based AI systems on distributed machines. We propose stack-splitting, a technique for implementing or-parallelism, which ...
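The abstract is cut off before its technical details, so the following is only a schematic sketch of the stack-splitting idea as it is commonly described (divide the unexplored alternatives recorded in the choice-point stack between a busy worker and an idle one), not the authors' implementation. The data structures are invented for illustration, and the copying of execution state (bindings, environments) that a real or-parallel engine would also need is omitted.

/* Schematic sketch of stack-splitting for or-parallel search (not the
 * paper's implementation): when an idle worker requests work, the busy
 * worker gives away half of the unexplored alternatives recorded in each
 * choice point on its stack, so both workers can proceed independently. */
#include <stdio.h>
#include <stddef.h>

#define MAX_ALTS 16

typedef struct {
    int alternatives[MAX_ALTS];  /* unexplored branches at this node */
    int n_alternatives;          /* how many remain to be tried */
} ChoicePoint;

typedef struct {
    ChoicePoint *cps;  /* choice-point stack, oldest first */
    size_t depth;      /* number of live choice points */
} Worker;

/* Split the donor's open alternatives with an idle worker: each choice
 * point's remaining alternatives are halved between the two workers. */
void stack_split(Worker *donor, Worker *idle)
{
    idle->depth = donor->depth;
    for (size_t i = 0; i < donor->depth; i++) {
        ChoicePoint *src = &donor->cps[i];
        ChoicePoint *dst = &idle->cps[i];

        int give = src->n_alternatives / 2;  /* half go to the idle worker */
        int keep = src->n_alternatives - give;

        dst->n_alternatives = give;
        for (int j = 0; j < give; j++)
            dst->alternatives[j] = src->alternatives[keep + j];

        src->n_alternatives = keep;          /* donor keeps the first half */
    }
}

int main(void)
{
    ChoicePoint donor_cps[2] = { { {1, 2, 3, 4}, 4 }, { {10, 20}, 2 } };
    ChoicePoint idle_cps[2];
    Worker donor = { donor_cps, 2 };
    Worker idle  = { idle_cps, 0 };

    stack_split(&donor, &idle);
    printf("donor keeps %d and %d alternatives; idle receives %d and %d\n",
           donor_cps[0].n_alternatives, donor_cps[1].n_alternatives,
           idle_cps[0].n_alternatives, idle_cps[1].n_alternatives);
    return 0;
}

The usual motivation for splitting at this granularity is that, after a single exchange, the two workers own disjoint portions of the search tree and can backtrack independently, which reduces scheduler interaction and is why the technique is attractive on distributed machines.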
We believe that future many-core architectures should support a simple and scalable way to execute many threads that are generated by parallel programs. A good candidate to impleme...
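The candidate mechanism is cut off in the abstract above, so the sketch below only illustrates the kind of workload meant by "many threads generated by parallel programs": a recursive computation that spawns a fine-grained task per call, here expressed with OpenMP tasks purely as an example. The choice of OpenMP and of a naive recursive Fibonacci are assumptions for illustration, not anything taken from the paper.

/* Illustration of "many threads generated by parallel programs": a
 * recursive computation that creates a fine-grained task per call.
 * Compile with an OpenMP-enabled compiler, e.g. gcc -fopenmp. */
#include <omp.h>
#include <stdio.h>

static long fib(int n)
{
    long a, b;
    if (n < 2)
        return n;
    /* Each recursive call becomes a task the runtime may run on any core. */
    #pragma omp task shared(a)
    a = fib(n - 1);
    #pragma omp task shared(b)
    b = fib(n - 2);
    #pragma omp taskwait
    return a + b;
}

int main(void)
{
    long result;
    #pragma omp parallel
    #pragma omp single
    result = fib(30);
    printf("fib(30) = %ld\n", result);
    return 0;
}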
The characteristics of irregular algorithms make a parallel implementation difficult, especially for PC clusters or clusters of SMPs. These characteristics may include an unpredi...
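Since the list of characteristics is truncated and the paper's own approach is not visible here, the code below sketches one common way of coping with unpredictable per-task cost on a PC cluster, rather than the paper's technique: a master rank hands out work units on demand over MPI, so ranks that happen to draw cheap units simply request more. The task count, tags, and the process() placeholder are all illustrative.

/* Sketch of dynamic, on-demand work distribution for irregular workloads
 * on a cluster (not necessarily the paper's approach). */
#include <mpi.h>
#include <stdio.h>

#define N_TASKS  100
#define TAG_WORK 1
#define TAG_STOP 2

/* Placeholder for an irregular computation whose cost varies per task. */
static double process(int task) { return (double)task * task; }

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                      /* master: distribute on demand */
        int next = 0, active = size - 1, dummy;
        MPI_Status st;
        while (active > 0) {
            MPI_Recv(&dummy, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &st);        /* a worker asks for work */
            if (next < N_TASKS) {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK,
                         MPI_COMM_WORLD);
                next++;
            } else {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_STOP,
                         MPI_COMM_WORLD);
                active--;
            }
        }
    } else {                              /* worker: pull tasks until stopped */
        int task, request = 0;
        MPI_Status st;
        for (;;) {
            MPI_Send(&request, 1, MPI_INT, 0, TAG_WORK, MPI_COMM_WORLD);
            MPI_Recv(&task, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_STOP)
                break;
            process(task);
        }
    }
    MPI_Finalize();
    return 0;
}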