Many image and signal processing kernels can be optimized for performance, while consuming a reasonable area, by parallelizing loops with extensive use of pipelining. This paper p...
Zubair Nawaz, Thomas Marconi, Koen Bertels, Todor ...
Grid applications often need to distribute large amounts of data efficiently from one cluster to multiple others (multicast). Existing methods usually arrange nodes in optimized t...
This paper presents Clusterfile, a parallel file system that provides parallel file access on a cluster of computers. Existing parallel file systems offer little control over matc...
Given the complexity of parallel programs, developers often must rely on performance analysis tools to help them improve the performance of their code. While many tools support th...
We consider the filter decomposition problem in supporting coarse-grained pipelined parallelism. This form of parallelism is suitable for data-driven applications in scenarios wh...