Sciweavers

67 search results - page 1 / 14
» Data transformations enabling loop vectorization on multithr...
Sort
View
PPOPP
2010
ACM
14 years 2 months ago
Data transformations enabling loop vectorization on multithreaded data parallel architectures
Loop vectorization, a key feature exploited to obtain high performance on Single Instruction Multiple Data (SIMD) vector architectures, is significantly hindered by irregular memo...
Byunghyun Jang, Perhaad Mistry, Dana Schaa, Rodrig...
NPC
2010
Springer
13 years 3 months ago
Exposing Tunable Parameters in Multi-threaded Numerical Code
Achieving high performance on today’s architectures requires careful orchestration of many optimization parameters. In particular, the presence of shared-caches on multicore arch...
Apan Qasem, Jichi Guo, Faizur Rahman, Qing Yi
IPPS
2010
IEEE
13 years 2 months ago
Restructuring parallel loops to curb false sharing on multicore architectures
The memory hierarchy of most multicore systems contains one or more levels of cache that is shared among multiple cores. The shared-cache architecture presents many opportunities f...
Santosh Sarangkar, Apan Qasem
DATE
2006
IEEE
159views Hardware» more  DATE 2006»
13 years 11 months ago
Distributed loop controller architecture for multi-threading in uni-threaded VLIW processors
Reduced energy consumption is one of the most important design goals for embedded application domains like wireless, multimedia and biomedical. Instruction memory hierarchy has be...
Praveen Raghavan, Andy Lambrechts, Murali Jayapala...
ISCA
2008
IEEE
148views Hardware» more  ISCA 2008»
13 years 11 months ago
Atomic Vector Operations on Chip Multiprocessors
The current trend is for processors to deliver dramatic improvements in parallel performance while only modestly improving serial performance. Parallel performance is harvested th...
Sanjeev Kumar, Daehyun Kim, Mikhail Smelyanskiy, Y...