Clusters of workstations have become a cost-effective means of performing scientific computations. However, large network latencies, resource sharing, and heterogeneity found in ...
We propose an interconnect reorganization algorithm for reduction stages in parallel multipliers. It aims at minimizing power consumption for given static probabilities at the pri...
Saeeid Tahmasbi Oskuii, Per Gunnar Kjeldsberg, Osc...
Discovering complex associations, anomalies and patterns in distributed data sets is gaining popularity in a range of scientific, medical and business applications. Various algor...
Omer F. Rana, David W. Walker, Maozhen Li, Steven ...
All Netflix Prize algorithms proposed so far are prohibitively costly for large-scale production systems. In this paper, we describe an efficient dataflow implementation of a coll...
Srivatsava Daruru, Nena M. Marin, Matt Walker, Joy...
Efficient communication and synchronization is crucial for finegrained parallelism. Libraries providing such features, while indispensable, are difficult to write, and often ca...