Sciweavers

» Dag-Consistent Distributed Shared Memory
IPPS
2007
IEEE
15 years 4 months ago
Optimizing Inter-Nest Data Locality Using Loop Splitting and Reordering
With the increasing gap between processor speed and memory latency, the performance of data-dominated programs is becoming more reliant on fast data access, which can be improved...
Sofiane Naci
HIPC
2007
Springer
15 years 3 months ago
Optimization of Collective Communication in Intra-cell MPI
The Cell is a heterogeneous multi-core processor, which has eight co-processors, called SPEs. The SPEs can access a common shared main memory through DMA, and each SPE can direct...
M. K. Velamati, Arun Kumar, Naresh Jayam, Ganapath...
ICPPW
2006
IEEE
15 years 3 months ago
Retargeting Image-Processing Algorithms to Varying Processor Grain Sizes
Embedded computing architectures can be designed to meet a variety of application-specific requirements. However, optimized hardware can require compiler support to realize the po...
Sam Sander, Linda M. Wills
IPPS
2006
IEEE
15 years 3 months ago
Parallel implementation and performance characterization of MUSCLE
Multiple sequence alignment is a fundamental and very computationally intensive task in molecular biology. MUSCLE, a new algorithm for creating multiple alignments of protein sequ...
Xi Deng, Eric Li, Jiulong Shan, Wenguang Chen
SPAA
2006
ACM
15 years 3 months ago
The cache complexity of multithreaded cache oblivious algorithms
We present a technique for analyzing the number of cache misses incurred by multithreaded cache oblivious algorithms on an idealized parallel machine in which each processor has a...
Matteo Frigo, Volker Strumpen