At the core of contemporary high performance computer systems is the communication infrastructure. For this reason, there has been a lot of work on providing low-latency, high-ban...
Sven Karlsson, Stavros Passas, George Kotsis, Ange...
Noncontiguous data access is a very common access pattern in many scientific applications. Using POSIX I/O to access many pieces of noncontiguous data segments will generate a lot...
Programming distributed-memory machines requires careful placement of datato balance the computationalload among the nodes and minimize excess data movement between the nodes. Mos...
This paper shows new algorithms and the implementations of image reorganization for EAN/QR barcodes in mobile phones. The mobile phone system used here consists of a camera, mobil...
Abstract. Performance of the on-chip cache is critical for processor. The multithread program model usually employed by on-chip many-core architectures may have effects on cache ac...