Sciweavers

630 search results - page 57 / 126
» Optimized union of non-disjoint distributed data sets
Sort
View
ICS
2010
Tsinghua U.
15 years 7 days ago
Speeding up Nek5000 with autotuning and specialization
Autotuning technology has emerged recently as a systematic process for evaluating alternative implementations of a computation, in order to select the best-performing solution for...
Jaewook Shin, Mary W. Hall, Jacqueline Chame, Chun...
IPPS
2003
IEEE
15 years 3 months ago
Parallel ROLAP Data Cube Construction On Shared-Nothing Multiprocessors
The pre-computation of data cubes is critical to improving the response time of On-Line Analytical Processing (OLAP) systems and can be instrumental in accelerating data mining tas...
Ying Chen, Frank K. H. A. Dehne, Todd Eavis, Andre...
ICS
2009
Tsinghua U.
15 years 4 months ago
MPI-aware compiler optimizations for improving communication-computation overlap
Several existing compiler transformations can help improve communication-computation overlap in MPI applications. However, traditional compilers treat calls to the MPI library as ...
Anthony Danalis, Lori L. Pollock, D. Martin Swany,...
EUROPAR
2010
Springer
14 years 10 months ago
Optimized On-Chip-Pipelined Mergesort on the Cell/B.E
Abstract. Limited bandwidth to off-chip main memory is a performance bottleneck in chip multiprocessors for streaming computations, such as Cell/B.E., and this will become even mor...
Rikard Hultén, Christoph W. Kessler, Jö...
DAC
1996
ACM
15 years 1 months ago
Optimal Clock Skew Scheduling Tolerant to Process Variations
1- A methodology is presented in this paper for determining an optimal set of clock path delays for designing high performance VLSI/ULSI-based clock distribution networks. This met...
José Luis Neves, Eby G. Friedman