Sciweavers

ICS
2010
Tsinghua U.
13 years 3 months ago
Decomposable and responsive power models for multicore processors using performance counters
Abstract—Power modeling based on performance monitoring counters (PMCs) has attracted the interest of many researchers since it become a quick approach to understand and analyse ...
Ramon Bertran, Marc González, Xavier Martor...
ICS
2010
Tsinghua U.
13 years 3 months ago
An approach to resource-aware co-scheduling for CMPs
We develop real-time scheduling techniques for improving performance and energy for multiprogrammed workloads that scale nonuniformly with increasing thread counts. Multithreaded ...
Major Bhadauria, Sally A. McKee
ICS
2010
Tsinghua U.
13 years 6 months ago
Timing local streams: improving timeliness in data prefetching
Data prefetching technique is widely used to bridge the growing performance gap between processor and memory. Numerous prefetching techniques have been proposed to exploit data pa...
Huaiyu Zhu, Yong Chen, Xian-He Sun
ICS
2010
Tsinghua U.
13 years 7 months ago
The auction: optimizing banks usage in Non-Uniform Cache Architectures
The growing influence of wire delay in cache design has meant that access latencies to last-level cache banks are no longer constant. Non-Uniform Cache Architectures (NUCAs) have ...
Javier Lira, Carlos Molina, Antonio Gonzále...
ICS
2010
Tsinghua U.
13 years 7 months ago
Speeding up Nek5000 with autotuning and specialization
Autotuning technology has emerged recently as a systematic process for evaluating alternative implementations of a computation, in order to select the best-performing solution for...
Jaewook Shin, Mary W. Hall, Jacqueline Chame, Chun...
ICS
2010
Tsinghua U.
13 years 7 months ago
InterferenceRemoval: removing interference of disk access for MPI programs through data replication
As the number of I/O-intensive MPI programs becomes increasingly large, many efforts have been made to improve I/O performance, on both software and architecture sides. On the sof...
Xuechen Zhang, Song Jiang
ICS
2010
Tsinghua U.
13 years 7 months ago
Clustering performance data efficiently at massive scales
Existing supercomputers have hundreds of thousands of processor cores, and future systems may have hundreds of millions. Developers need detailed performance measurements to tune ...
Todd Gamblin, Bronis R. de Supinski, Martin Schulz...