We address the problem of efficient out-of-core code generation for a special class of imperfectly nested loops encoding tensor contractions arising in quantum chemistry computati...
OpenMP is an architecture-independent language for programming in the shared memory model. OpenMP is designed to be simple in terms of programming abstractions. Unfortunately,...
High-accuracy PDE solvers use multi-dimensional fast Fourier transforms. The FFTs exhibit a static and structured memory access pattern that results in a large amount of communic...
A parallel version of the self-verified method for solving linear systems was presented in [19, 18]. In this research we propose improvements aimed at better performan...
This article addresses the fast solution of a Quadratic Program underlying a Linear Model Predictive Control scheme that generates walking motions. We introduce an algorithm wh...