Sciweavers

ISCA
1995
IEEE
13 years 7 months ago
The EM-X Parallel Computer: Architecture and Basic Performance
Latency tolerance is essential in achieving high performance on parallel computers for remote function calls and fine-grained remote memory accesses. EM-X supports interprocessor ...
Yuetsu Kodama, Hirohumi Sakane, Mitsuhisa Sato, Ha...
ECOOPW
1998
Springer
13 years 8 months ago
A Rational Approach to Portable High Performance: The Basic Linear Algebra Instruction Set (BLAIS) and the Fixed Algorithm Size
Abstract. We introduce a collection of high performance kernels for basic linear algebra. The kernels encapsulate small fixed size computations in order to provide building blocks fo...
Jeremy G. Siek, Andrew Lumsdaine
CONCURRENCY
2000
13 years 4 months ago
Performance characteristics for OpenMP constructs on different parallel computer architectures
OpenMP is emerging as a quasi-standard for shared memory parallel programming on small SMP-systems. To serve as a common programming interface in shared memory parallel programmin...
Rudolf Berrendorf, Guido Nieken
PARA
1995
Springer
13 years 7 months ago
A Proposal for a Set of Parallel Basic Linear Algebra Subprograms
This paper describes a proposal for a set of Parallel Basic Linear Algebra Subprograms (PBLAS). The PBLAS are targeted at distributed vector-vector, matrix-vector and matrix-matrix...
Jaeyoung Choi, Jack Dongarra, Susan Ostrouchov, An...
CCGRID
2001
IEEE
13 years 8 months ago
TACO-Exploiting Cluster Networks for High-Level Collective Operations
TACO (Topologies and Collections) is a template library that introduces the flavour of distributed data parallel processing by means of reusable topology classes and C++ s. This p...
Jörg Nolte, Mitsuhisa Sato, Yutaka Ishikawa