Sciweavers

677 search results - page 116 / 136
» Distributed data-parallel computing using a high-level progr...
CF
2006
ACM
15 years 4 months ago
Intermediately executed code is the key to find refactorings that improve temporal data locality
The growing speed gap between memory and processor makes an efficient use of the cache ever more important to reach high performance. One of the most important ways to improve cac...
Kristof Beyls, Erik H. D'Hollander
SIGCSE
2009
ACM
167 views, Education
16 years 2 months ago
Python CS1 as preparation for C++ CS2
How suitable is a Python-based CS1 course as preparation for a C++-based CS2 course? After fifteen years of using C++ for both CS1 and CS2, the Computer Science Department at Mich...
Richard J. Enbody, William F. Punch, Mark McCullen
SPAA
2009
ACM
16 years 2 months ago
Towards transactional memory semantics for C++
Transactional memory (TM) eliminates many problems associated with lock-based synchronization. Over recent years, much progress has been made in software and hardware implementati...
Tatiana Shpeisman, Ali-Reza Adl-Tabatabai, Robert ...
ICS
2009
Tsinghua U.
15 years 9 months ago
High-performance CUDA kernel execution on FPGAs
In this work, we propose a new FPGA design flow that combines the CUDA programming model from Nvidia with the state-of-the-art high-level synthesis tool AutoPilot from AutoESL, to...
Alexandros Papakonstantinou, Karthik Gururaj, John...
PPOPP
1990
ACM
15 years 6 months ago
Concurrent Aggregates (CA)
To program massively concurrent MIMD machines, programmers need tools for managing complexity. One important tool that has been used in the sequential programming world is hierarchies of...
Andrew A. Chien, William J. Dally