An implementation of a parallel ScaLAPACK-style solver for the general Sylvester equation, op(A)X −Xop(B) = C, where op(A) denotes A or its transpose AT , is presented. The paral...
We introduce a 64-bit ANSI/IEEE Std 754-1985 floating point design of a hardware matrix multiplier optimized for FPGA implementations. A general block matrix multiplication algor...
Yong Dou, Stamatis Vassiliadis, Georgi Kuzmanov, G...
This paper examines the scalable parallel implementation of QR factorization of a general matrix, targeting SMP and multi-core architectures. Two implementations of algorithms-by-...
—This paper describes a multi-threaded parallel design and implementation of the Smith-Waterman (SM) algorithm on compute unified device architecture (CUDA)-compatible graphic pr...
The Web abounds with dyadic data that keeps increasing by every single second. Previous work has repeatedly shown the usefulness of extracting the interaction structure inside dya...