In this paper, we present an efficient procedure for building a piecewise linear function approximation of the speed function of a processor with hierarchical memory structure. Th...
Abstract. Hardware realization of kernel loops holds the promise of accelerating the overall application performance and is therefore an important part of the synthesis process. In...
Abstract. One of the most important collective communication patterns for scientific applications is the many to many, also called complete exchange. Although efficient All-to-All...
In this paper, we consider the problem of blind multiuser separation-equalization in the uplink of a wideband DS-CDMA system, in a multipath propagation environment with intersymbo...
Concurrent collection classes are widely used in multi-threaded programming, but they provide atomicity only for a fixed set of operations. Software transactional memory (STM) pr...
Nathan Grasso Bronson, Jared Casper, Hassan Chafi,...