This paper evaluates the use of per-node multi-threading to hide remote memory and synchronization latencies in a software DSM. As with hardware systems, multi-threading in softwa...
Abstract—This paper describes an algorithm for deriving data and computation partitions on scalable shared memory multiprocessors. The algorithm establishes affinity relationshi...
We describe how we have parallelized Python, an interpreted object oriented scripting language, and used it to build an extensible message-passing molecular dynamics application f...
Naive parallel implementation of nondeterministic systems (such as a theorem proving system) and languages (such as a logic, constraint, or a concurrent constraint language)can re...
Although platform-independent runtime systems for parallel programming languages are desirable, the need for low-level optimizations usually precludes their existence. This is bec...