A datapath synthesis system (DPSS) for a bus-based wavefront array architecture, called rDPA (reconfigurable datapath architecture), is presented. An internal data bus to the arra...
Many parallel applications from scientific computing use MPI global communication operations to collect or distribute data. Since the execution times of these communication opera...
The caching behavior of multimedia applications has been described as having high instruction reference locality within small loops, very large working sets, and poor data cache p...
This paper presents a novel modeling algorithm that is capable of simultaneously recovering correct shape geometry as well as its unknown topology from arbitrarily complicated dat...
An optimizing compiler has a hard time to generate a code which will perform at top speed for an arbitrary data set size. In general, the low level optimization process must take i...