Sciweavers

26 search results - page 4 / 6
» Loop Optimization using Hierarchical Compilation and Kernel ...
LCPC
2004
Springer
13 years 11 months ago
Empirical Performance-Model Driven Data Layout Optimization
Empirical optimizers like ATLAS have been very effective in optimizing computational kernels in libraries. The best choice of parameters such as tile size and degree of l...
Qingda Lu, Xiaoyang Gao, Sriram Krishnamoorthy, Ge...
IPPS
2002
IEEE
13 years 10 months ago
Effective Cross-Platform, Multilevel Parallelism via Dynamic Adaptive Execution
This paper presents preliminary efforts to develop compilation and execution environments that achieve performance portability of multilevel parallelization on hierarchical archit...
Walden Ko, Mark N. Yankelevsky, Dimitrios S. Nikol...
CF
2009
ACM
14 years 9 days ago
Mapping the LU decomposition on a many-core architecture: challenges and solutions
Recently, multi-core architectures with alternative memory subsystem designs have emerged. Instead of using hardware-managed cache hierarchies, they employ software-managed embedde...
Ioannis E. Venetis, Guang R. Gao
IPPS
2007
IEEE
14 years 2 days ago
POET: Parameterized Optimizations for Empirical Tuning
The excessive complexity of both machine architectures and applications has made it difficult for compilers to statically model and predict application behavior. This observatio...
Qing Yi, Keith Seymour, Haihang You, Richard W. Vu...
CF
2005
ACM
13 years 7 months ago
A case for a working-set-based memory hierarchy
Modern microprocessor designs continue to obtain impressive performance gains through increasing clock rates and advances in the parallelism obtained via micro-architecture design...
Steve Carr, Soner Önder