Profiling is often the method of choice for performance analysis of parallel applications due to its low overhead and easily comprehensible results. However, a disadvanta...
With reference to an object type defining the two basic operations, read and write, we present solutions to the object sharing problem, classified according to the migration and/o...
To achieve optimal performance, garbage-collected applications must balance the sizes of their heaps dynamically. Sizing the heap too small can reduce throughput by increasing the...
Matthew Hertz, Stephen Kane, Elizabeth Keudel, Ton...
In this paper we present a parallel algorithm for the topological watershed, suitable for a shared memory parallel architecture. On a 24-core machine an average speed-up ...
The memory consistency model supported by a multiprocessor architecture determines the amount of buffering and pipelining that may be used to hide or reduce the latency of memory ...
Kourosh Gharachorloo, Anoop Gupta, John L. Henness...