Sciweavers

PPOPP
2010
ACM
14 years 2 months ago
Data transformations enabling loop vectorization on multithreaded data parallel architectures
Loop vectorization, a key feature exploited to obtain high performance on Single Instruction Multiple Data (SIMD) vector architectures, is significantly hindered by irregular memo...
Byunghyun Jang, Perhaad Mistry, Dana Schaa, Rodrig...
PPOPP
2010
ACM
14 years 3 days ago
Thread to strand binding of parallel network applications in massive multi-threaded systems
In processors with several levels of hardware resource sharing, like CMPs in which each core is an SMT, the scheduling process becomes more complex than in processors with a singl...
Petar Radojkovic, Vladimir Cakarevic, Javier Verdú...
PPOPP
2010
ACM
14 years 2 months ago
A distributed placement service for graph-structured and tree-structured data
Effective data placement strategies can enhance the performance of data-intensive applications implemented on high end computing clusters. Such strategies can have a significant i...
Gregory Buehrer, Srinivasan Parthasarathy, Shirish...
PPOPP
2010
ACM
14 years 2 months ago
A practical concurrent binary search tree
We propose a concurrent relaxed-balance AVL tree algorithm that is fast, scales well, and tolerates contention. It is based on optimistic techniques adapted from software transact...
Nathan Grasso Bronson, Jared Casper, Hassan Chafi,...
PPOPP
2010
ACM
14 years 14 days ago
An adaptive performance modeling tool for GPU architectures
This paper presents an analytical model to predict the performance of general-purpose applications on a GPU architecture. The model is designed to provide performance information ...
Sara S. Baghsorkhi, Matthieu Delahaye, Sanjay J. P...
Distributed And Parallel Computing