Abstract. We reexamine the limits of parallelism available in programs, using runtime reconstruction of program data-flow graphs. While limits of parallelism have been examined in...
Recently, the number of cores on general-purpose processors has been increasing rapidly. Using conventional programming models, it is challenging to effectively exploit these core...
Jayanth Gummaraju, Joel Coburn, Yoshio Turner, Men...
An emerging trend in processor design is the addition of short vector instructions to general-purpose and embedded ISAs. Frequently, these extensions are employed using traditiona...
Samuel Larsen, Rodric M. Rabbah, Saman P. Amarasin...
Programming efficiency of heterogeneous concurrent systems is limited by the use of lock-based synchronization mechanisms. Transactional memories can greatly improve the programmi...
Many standardized hardware communication interfaces offer runtime flexibility and configurability at the cost of efficiency. An alternate approach is the use of a highly-effic...
Steve Ward, Karim Abdalla, Rajeev Dujari, Michael ...