This paper presents a dynamic task scheduling approach to executing dense linear algebra algorithms on multicore systems (either shared-memory or distributed-memory). We use a tas...
Power and energy are first-order design constraints in high performance computing. Current research using dynamic voltage scaling (DVS) relies on trading increased execution time...
Barry Rountree, David K. Lowenthal, Bronis R. de S...
—We have proposed an auto-memoization processor. This processor automatically and dynamically memoizes both functions and loop iterations, and skips their execution by reusing th...
Understanding the potential and implications of asymmetric multi-core processors for cluster computing is necessary, as these processors are rapidly becoming mainstream components...
Filip Blagojevic, Matthew Curtis-Maury, Jae-Seung ...
The demand for high performance has driven acyclic computation accelerators into extensive use in modern embedded and desktop architectures. Accelerators that are ideal from a sof...