In this paper, an efficient VLSI architecture for a fast computation of the 2-D discrete wavelet transform (DWT) is proposed. The architecture employing a three-stage cascade in p...
The current trend in HPC hardware is towards clusters of shared-memory (SMP) compute nodes. For applications developers the major question is how best to program these SMP cluster...
Loops are the main time consuming part of programs based on floating point computations. The performance of the loops is limited either by recurrences in the computation or by the...
The current challenge for crowd simulations is the design and development of a scalable system that is capable of simulating the individual behavior of millions of complex agents ...
In this paper we propose an efficient real-time communication mechanism for distributed vision processing. One of the biggest problems of distributed vision processing, as is the ...