Abstract--On NVIDIA's many-core GPUs, there is no synchronization function among parallel thread blocks. When finegranularity of data communication and synchronization is requ...
Jianmin Chen, Zhuo Huang, Feiqi Su, Jih-Kwon Peir,...
We describe the implementation of an out-of-core, distribution-based sorting program on a cluster using FG, a multithreaded programming framework. FG mitigates latency from disk-I/...
Priya Natarajan, Thomas H. Cormen, Elena Riccio St...
Low-latency and high-throughput processing are key requirements of data stream management systems (DSMSs). Hence, multi-core processors that provide high aggregate processing capa...
Programming heterogeneous parallel computer systems is notoriously difficult, but MIMD models have proven to be portable across multi-core processors, clusters, and massively paral...
Current development tools for embedded real-time systems do not efficiently support the timing aspect. The most important timing parameter for scheduling and system analysis is th...
Friedhelm Stappert, Andreas Ermedahl, Jakob Engblo...