Shared-memory multi-threaded programming is inherently more difficult than single-threaded programming. The main source of complexity is that, the threads of an application can in...
We present experimental evidence that multiple compute-units, compiled from sequential high-level language input programs, can be merged into a reduced number of configurations f...
: This paper describes the development of an Intellectual Property (IP) core in VHDL able to implement a Multilayer Perceptron (MLP) artificial neural network (ANN) topology with u...
The use of multiprocessor tasks (M-tasks) has been shown to be successful for mixed task and data parallel implementations of algorithms from scientific computing. The approach o...
Much of high performance technical computing has moved from shared memory architectures to message based cluster systems. The development and wide adoption of the MPI parallel pro...