To exploit the potential of multicore architectures, recent dense linear algebra libraries have used tile algorithms, which consist in scheduling a Directed Acyclic Graph (DAG) of...
Bilel Hadri, Hatem Ltaief, Emmanuel Agullo, Jack D...
Artificial Neural Networks (ANNs) and image processing require massively parallel computation of simple operators accompanied by heavy memory access. Thus, this type of ...
Dongsun Kim, Hyunsik Kim, Hongsik Kim, Gunhee Han,...
Stream-based overlay networks (SBONs) are one approach to implementing large-scale stream processing systems. A fundamental consideration in an SBON is that of service placement, ...
Peter R. Pietzuch, Jeffrey Shneidman, Jonathan Led...
We present a pipelining, dynamically user-controllable reorder operator, for use in data-intensive applications. Allowing the user to reorder the data delivery on the fly increases...
Vijayshankar Raman, Bhaskaran Raman, Joseph M. Hel...
Lack of effective performance-evaluation environments is a major barrier to the broader use of high performance computing. Conventional performance environments are based on profi...