Although stencil auto-tuning has shown tremendous potential in effectively utilizing architectural resources, it has hitherto been limited to single kernel instantiations; in addi...
Shoaib Kamil, Cy Chan, Leonid Oliker, John Shalf, ...
Multi-core processors naturally exploit thread-level parallelism (TLP). However, extracting instruction-level parallelism (ILP) from individual applications or threads is still a ...
Technology trends have led to the advent of multi-core chips in the form of both general-purpose chip multiprocessors (CMPs) and embedded multi-processor systems-on-a-chip (MPSoCs...
Middleware for web service orchestration, such as runtime engines for executing business processes, workflows, or web service compositions, can easily become performance bottleneck...
A parallel version of the plane sweep algorithm targeted towards the small number of processing cores available on commonly available multi-core systems is presented. Experimental...