Sciweavers

2563 search results - page 389 / 513
» Parallel matrix algorithms and applications
Sort
View
ICS
2010
Tsinghua U.
15 years 3 months ago
Clustering performance data efficiently at massive scales
Existing supercomputers have hundreds of thousands of processor cores, and future systems may have hundreds of millions. Developers need detailed performance measurements to tune ...
Todd Gamblin, Bronis R. de Supinski, Martin Schulz...
104
Voted
IESS
2007
Springer
156views Hardware» more  IESS 2007»
15 years 7 months ago
Automatic Data Path Generation from C code for Custom Processors
The stringent performance constraints and short time to market of modern digital systems require automatic methods for design of high performance applicationspecific architectures...
Jelena Trajkovic, Daniel Gajski
123
Voted
SIGIR
2011
ACM
14 years 3 months ago
No free lunch: brute force vs. locality-sensitive hashing for cross-lingual pairwise similarity
This work explores the problem of cross-lingual pairwise similarity, where the task is to extract similar pairs of documents across two different languages. Solutions to this pro...
Ferhan Ture, Tamer Elsayed, Jimmy J. Lin
138
Voted
PPOPP
2009
ACM
16 years 1 months ago
OpenMP to GPGPU: a compiler framework for automatic translation and optimization
GPGPUs have recently emerged as powerful vehicles for generalpurpose high-performance computing. Although a new Compute Unified Device Architecture (CUDA) programming model from N...
Seyong Lee, Seung-Jai Min, Rudolf Eigenmann
141
Voted
ICPADS
2010
IEEE
14 years 10 months ago
Data-Aware Task Scheduling on Multi-accelerator Based Platforms
To fully tap into the potential of heterogeneous machines composed of multicore processors and multiple accelerators, simple offloading approaches in which the main trunk of the ap...
Cédric Augonnet, Jérôme Clet-O...