This paper presents a new GPU-based tensor voting implementation which achieves significant performance improvement over the conventional CPU-based implementation. Although the t...
Abstract. The ability to offload functionality to a programmable network interface is appealing, both for increasing message passing performance and for reducing the overhead on t...
This paper investigates the utilization of the master-slave (MS) paradigm as an alternative to domain decomposition (DD) methods for parallelizing lattice gauge theory (LGT) model...
The OpenMP standard was conceived to parallelize dense array-based applications, and it has achieved much success with that. Recently, a novel tasking proposal to handle unstructur...
Reconfigurable systems, and in particular, FPGA-based custom computing machines, offer a unique opportunity to define application-specific architectures. These architectures offer...
Heidi E. Ziegler, Byoungro So, Mary W. Hall, Pedro...