Sciweavers

421 search results - page 60 / 85
» An Intelligent Parallel Loop Scheduling for Parallelizing Co...
Sort
View
IWOMP
2009
Springer
15 years 4 months ago
Evaluating OpenMP 3.0 Run Time Systems on Unbalanced Task Graphs
The UTS benchmark is used to evaluate task parallelism in OpenMP 3.0 as implemented in a number of recently released compilers and run-time systems. UTS performs parallel search of...
Stephen Olivier, Jan Prins
ICS
2001
Tsinghua U.
15 years 2 months ago
Computer aided hand tuning (CAHT): "applying case-based reasoning to performance tuning"
For most parallel and high performance systems, tuning guides provide the users with advices to optimize the execution time of their programs. Execution time may be very sensitive...
Antoine Monsifrot, François Bodin
ICS
2004
Tsinghua U.
15 years 3 months ago
Applications of storage mapping optimization to register promotion
Storage mapping optimization is a flexible approach to folding array dimensions in numerical codes. It is designed to reduce the memory footprint after a wide spectrum of loop tr...
Patrick Carribault, Albert Cohen
HPCA
2009
IEEE
15 years 10 months ago
Bridging the computation gap between programmable processors and hardwired accelerators
New media and signal processing applications demand ever higher performance while operating within the tight power constraints of mobile devices. A range of hardware implementatio...
Kevin Fan, Manjunath Kudlur, Ganesh S. Dasika, Sco...
ISLPED
2005
ACM
93views Hardware» more  ISLPED 2005»
15 years 3 months ago
Power-aware code scheduling for clusters of active disks
In this paper, we take the idea of application-level processing on disks to one level further, and focus on an architecture, called Cluster of Active Disks (CAD), where the storag...
Seung Woo Son, Guangyu Chen, Mahmut T. Kandemir