Sciweavers

22 search results - page 4 / 5
» Automatic Partitioning of Parallel Loops with Parallelepiped...
CGO
2010
IEEE
14 years 4 days ago
Decoupled software pipelining creates parallelization opportunities
Decoupled Software Pipelining (DSWP) is one approach to automatically extract threads from loops. It partitions loops into long-running threads that communicate in a pipelined man...
Jialu Huang, Arun Raman, Thomas B. Jablin, Yun Zha...
PVM
1999
Springer
13 years 9 months ago
JPT: A Java Parallelization Tool
Abstract. PVM is a successful programming environment for distributed computing in the languages C and Fortran. Recently several implementations of PVM for Java have been added, ma...
Kristof Beyls, Erik H. D'Hollander, Yijun Yu
IEEEPACT
2009
IEEE
13 years 12 months ago
Automatic Tuning of Discrete Fourier Transforms Driven by Analytical Modeling
Analytical models have been used to estimate optimal values for parameters such as tile sizes in the context of loop nests. However, important algorithms such as fast Fourier tr...
Basilio B. Fraguela, Yevgen Voronenko, Markus Pü...
IEEEPACT
2007
IEEE
13 years 11 months ago
Performance Portable Optimizations for Loops Containing Communication Operations
Effective use of communication networks is critical to the performance and scalability of parallel applications. Partitioned Global Address Space languages like UPC bring the pro...
Costin Iancu, Wei Chen, Katherine A. Yelick
ISHPC
2003
Springer
13 years 10 months ago
Code and Data Transformations for Improving Shared Cache Performance on SMT Processors
Simultaneous multithreaded processors use shared on-chip caches, which yield better cost-performance ratios. Sharing a cache between simultaneously executing threads causes excessi...
Dimitrios S. Nikolopoulos