Sciweavers

901 search results - page 40 / 181
» Hiding Communication Latency in Data Parallel Applications
ICUMT
2009
15 years 1 month ago
Two-layer network coordinate system for Internet distance prediction
A network coordinate (NC) system is an efficient and scalable system for Internet distance prediction. In this paper, we propose three two-layer NC systems, HNPS, HBBS and HIDES, derived f...
Chengbo Dong, Guodong Wang, Xuan Zhang, Beixing De...
CCGRID
2008
IEEE
15 years 10 months ago
MPI Collectives on Modern Multicore Clusters: Performance Optimizations and Communication Characteristics
Advances in multicore technology and modern interconnects are rapidly increasing the number of cores deployed in today’s commodity clusters. A majority of parallel applicat...
Amith R. Mamidala, Rahul Kumar, Debraj De, Dhabale...
ICPP
2005
IEEE
15 years 9 months ago
LiMIC: Support for High-Performance MPI Intra-node Communication on Linux Cluster
High-performance intra-node communication support for MPI applications is critical for achieving the best performance from clusters of SMP workstations. Present-day MPI stacks cannot ...
Hyun-Wook Jin, Sayantan Sur, Lei Chai, Dhabaleswar...
ICS
2005
Tsinghua U.
15 years 9 months ago
Parallel sparse LU factorization on second-class message passing platforms
Several message passing-based parallel solvers have been developed for general (non-symmetric) sparse LU factorization with partial pivoting. Due to the fine-grain synchronizatio...
Kai Shen
FPL
2006
Springer
15 years 7 months ago
TMD-MPI: An MPI Implementation for Multiple Processors Across Multiple FPGAs
With current FPGAs, designers can now instantiate several embedded processors, memory units, and a wide variety of IP blocks to build a single-chip, high-performance multiprocesso...
Manuel Saldaña, Paul Chow