In this paper we give a new run-time technique for finding an optimal parallel execution schedule for a partially parallel loop, i.e., a loop whose parallelization requires syn...
Lawrence Rauchwerger, Nancy M. Amato, David A. Pad...
Abstract. We describe compiler and run-time optimisations for effective auto-parallelisation of C++ programs on the Cell BE architecture. Auto-parallelisation is made easier by anno...
Background: Gender differences in gene expression were estimated in liver samples from 9 males and 9 females. The study tested 31,110 genes for a gender difference using a design ...
Robert R. Delongchamp, Cruz Velasco, Stacey Dial, ...
In this paper we present a multi-grained parallel algorithm for computing betweenness centrality, which is extensively used in large-scale network analysis. Our method is based on ...
Large-scale CMPs with hundreds of cores require a directory-based protocol to maintain cache coherence. However, previously proposed coherence directories are hard to scale beyond...