Chip multiprocessors designed for streaming applications, such as Cell BE, offer impressive peak performance but suffer from limited bandwidth to off-chip main memory. As the number o...
We present a method to visit all nodes in a forest of data structures while taking into account object placement. We call the technique a Localized Tracing Scheme as it improves lo...
This paper describes a parallel associative processor, IXM2, developed mainly for semantic network processing. IXM2 consists of 64 associative processors and 9 network processors,...
In this paper we investigate a tunable MPI collective communications library on a cluster of SMPs. Most tunable collective communications libraries select optimal algorithms for i...
Full-wave studies of mode conversion (MC) processes in toroidal plasmas have required a prohibitive amount of computer resources in the past because of the disparate spatial scales ...
J. C. Wright, P. T. Bonoli, E. D'Azevedo, M. Bramb...