In retargeting loop-based code for multimedia instruction set extensions, a critical issue is that vector data types of mixed precision within a loop body complicate the paralleli...
We propose general-purpose natural heuristics for static block and block-cyclic heterogeneous data decomposition over the processes of a parallel program mapped onto a multidimensional g...
When parallel programs are executed on multiprocessors with private caches, a set of data may be repeatedly used and modified by different threads. Such data sharing can often r...
With the shrinking of transistors continuing to follow Moore's Law and the non-scalability of conventional out-of-order processors, multi-core systems are becoming the design ...
Bulk transport underlies data exfiltration and code update facilities in WSNs, but existing approaches are not designed for highly lossy and variable-quality links. We observe tha...