Sciweavers

1443 search results - page 211 / 289
» Improving the Performance of Distributed CORBA Applications
EUROPAR
2010
Springer
15 years 6 months ago
Optimized Dense Matrix Multiplication on a Many-Core Architecture
Abstract. Traditional parallel programming methodologies for improving performance assume cache-based parallel systems. However, new architectures, like the IBM Cyclops-64 (C64), b...
Elkin Garcia, Ioannis E. Venetis, Rishi Khan, Guan...
APPT
2005
Springer
15 years 11 months ago
Static Partitioning vs Dynamic Sharing of Resources in Simultaneous MultiThreading Microarchitectures
Simultaneous MultiThreading (SMT) achieves better system resource utilization and higher performance because it exploits Thread-Level Parallelism (TLP) in addition to “conventiona...
Chen Liu, Jean-Luc Gaudiot
MOBISYS
2007
ACM
16 years 5 months ago
Context-for-wireless: context-sensitive energy-efficient wireless data transfer
Ubiquitous connectivity on mobile devices will enable numerous new applications in healthcare and multimedia. We set out to check how close we are towards ubiquitous connectivity ...
Ahmad Rahmati, Lin Zhong
ASPLOS
2009
ACM
16 years 6 months ago
Dynamic prediction of collection yield for managed runtimes
The growth in complexity of modern systems makes it increasingly difficult to extract high performance. The software stacks for such systems typically consist of multiple layers a...
Michal Wegiel, Chandra Krintz
IPPS
2007
IEEE
15 years 11 months ago
Nonuniformly Communicating Noncontiguous Data: A Case Study with PETSc and MPI
Due to the complexity associated with developing parallel applications, scientists and engineers rely on high-level software libraries such as PETSc, ScaLAPACK and PESSL to ease th...
Pavan Balaji, Darius Buntinas, Satish Balay, Barry...