The model of bulk-synchronous parallel computation (BSP) helps to implement portable general purpose algorithms while keeping predictable performance on different parallel compute...
In this paper, we present two novel methodologies for testing the interconnect fabrics of network-on-chip (NoC) based chips. Both use the concept of recursive testing, with differ...
It is often impossible to obtain a one-size-fits-all solution for high performance algorithms when considering different choices for data distributions, parallelism, transformati...
Jason Ansel, Cy P. Chan, Yee Lok Wong, Marek Olsze...
Abstract. Data sets in large applications are often too massive to t completely inside the computer's internal memory. The resulting input output communication or I O between ...
In past massively parallel processing systems, such as the TMC CM-5 and the CRI T3E, the scheduling problem consisted of allocating a single type of resource among the waiting job...