To fully tap into the potential of heterogeneous machines composed of multicore processors and multiple accelerators, simple offloading approaches in which the main trunk of the ap...
Abstract--On NVIDIA's many-core GPUs, there is no synchronization function among parallel thread blocks. When finegranularity of data communication and synchronization is requ...
Jianmin Chen, Zhuo Huang, Feiqi Su, Jih-Kwon Peir,...
Large scale compute clusters continue to grow to ever-increasing proportions. However, as clusters and applications continue to grow, the Mean Time Between Failures (MTBF) has redu...
—Visual categorization is important to manage large collections of digital images and video, where textual metadata is often incomplete or simply unavailable. The bag-of-words mo...
Koen E. A. van de Sande, Theo Gevers, Cees G. M. S...
Queues are commonly used in multithreaded programs for synchronization and communication. However, because software queues tend to be too expensive to support finegrained paralle...
Sanghoon Lee, Devesh Tiwari, Yan Solihin, James Tu...