Multithreading is an important software modularization technique. However, it can incur substantial overheads, especially in processors where the amount of architecturally visible...
It has been shown that FPGAs could outperform high-end microprocessors on floating-point computations thanks to massive parallelism. However, most previous studies re-implement in...
This paper describes a proof-of-concept implementation of a basic autonomous computing system. The system consists of an XUP Virtex-II Pro board running Linux and a set of softwar...
In this paper, we use a thorough bottom-up optimization process (field, point and scalar arithmetic) to speed up the computation of elliptic curve point multiplication and report ...
We present a simulation-based performance model to analyze a parallel sparse LU factorization algorithm on modern cache-based, high-end parallel architectures. We consider supern...