In this paper, we propose two FPGA-area allocation algorithms based on profiling results for reducing the impact on performance of dynamic reconfiguration overheads. The problem o...
Elena Moscu Panainte, Koen Bertels, Stamatis Vassi...
This paper presents a performance analysis of an accelerated 2-D rigid image registration implementation that employs the Compute Unified Device Architecture (CUDA) programming e...
We set out an object localization scheme based on a convex programming matching method. The proposed approach is designed to match general objects, especially objects with very li...
Nested-parallelism programming models, where the task graph associated to a computation is series-parallel, present good analysis properties that can be exploited for scheduling, c...
This paper describes the design of a compiler which can translate out-of-core programs written in a data parallel language like HPF. Such a compiler is required for compiling larg...
Rajeev Thakur, Rajesh Bordawekar, Alok N. Choudhar...