Sciweavers

42 search results - page 8 / 9
» Improving superword level parallelism support in modern comp...
Sort
View
IEEEPACT
2006
IEEE
13 years 11 months ago
Fast, automatic, procedure-level performance tuning
This paper presents an automated performance tuning solution, which partitions a program into a number of tuning sections and finds the best combination of compiler options for e...
Zhelong Pan, Rudolf Eigenmann
MICRO
2000
IEEE
176views Hardware» more  MICRO 2000»
13 years 5 months ago
An Advanced Optimizer for the IA-64 Architecture
level of abstraction, compared with the program representation for scalar optimizations. For example, loop unrolling and loop unrolland-jam transformations exploit the large regist...
Rakesh Krishnaiyer, Dattatraya Kulkarni, Daniel M....
ISCA
2006
IEEE
125views Hardware» more  ISCA 2006»
13 years 11 months ago
Architectural Semantics for Practical Transactional Memory
Transactional Memory (TM) simplifies parallel programming by allowing for parallel execution of atomic tasks. Thus far, TM systems have focused on implementing transactional stat...
Austen McDonald, JaeWoong Chung, Brian D. Carlstro...
IEEEPACT
2008
IEEE
13 years 12 months ago
Skewed redundancy
Technology scaling in integrated circuits has consistently provided dramatic performance improvements in modern microprocessors. However, increasing device counts and decreasing o...
Gordon B. Bell, Mikko H. Lipasti
FPGA
2007
ACM
124views FPGA» more  FPGA 2007»
13 years 11 months ago
A practical FPGA-based framework for novel CMP research
Chip-multiprocessors are quickly gaining momentum in all segments of computing. However, the practical success of CMPs strongly depends on addressing the difficulty of multithread...
Sewook Wee, Jared Casper, Njuguna Njoroge, Yuriy T...