Sciweavers

IEEEPACT
2009
IEEE
13 years 11 months ago
Soft-OLP: Improving Hardware Cache Performance through Software-Controlled Object-Level Partitioning
—Performance degradation of memory-intensive programs caused by the LRU policy’s inability to handle weaklocality data accesses in the last level cache is increasingly serious ...
Qingda Lu, Jiang Lin, Xiaoning Ding, Zhao Zhang, X...
IEEEPACT
2009
IEEE
13 years 11 months ago
Mapping Out a Path from Hardware Transactional Memory to Speculative Multithreading
— This research demonstrates that coming support for hardware transactional memory can be leveraged to significantly reduce the cost of implementing true speculative multithread...
Leo Porter, Bumyong Choi, Dean M. Tullsen
IEEEPACT
2009
IEEE
13 years 11 months ago
Analytical Modeling of Pipeline Parallelism
Parallel programming is a requirement in the multi-core era. One of the most promising techniques to make parallel programming available for the general users is the use of parall...
Angeles G. Navarro, Rafael Asenjo, Siham Tabik, Ca...
IEEEPACT
2009
IEEE
13 years 11 months ago
Automatic Tuning of Discrete Fourier Transforms Driven by Analytical Modeling
—Analytical models have been used to estimate optimal values for parameters such as tile sizes in the context of loop nests. However, important algorithms such as fast Fourier tr...
Basilio B. Fraguela, Yevgen Voronenko, Markus P&uu...
IEEEPACT
2009
IEEE
13 years 11 months ago
CPROB: Checkpoint Processing with Opportunistic Minimal Recovery
—CPR (Checkpoint Processing and Recovery) is a physical register management scheme that supports a larger instruction window and higher average IPC than conventional ROB-style re...
Andrew D. Hilton, Neeraj Eswaran, Amir Roth
IEEEPACT
2009
IEEE
13 years 11 months ago
StealthTest: Low Overhead Online Software Testing Using Transactional Memory
—Software testing is hard. The emergence of multicore architectures and the proliferation of bugprone multithreaded software makes testing even harder. To this end, researchers h...
Jayaram Bobba, Weiwei Xiong, Luke Yen, Mark D. Hil...
IEEEPACT
2009
IEEE
13 years 11 months ago
Polyhedral-Model Guided Loop-Nest Auto-Vectorization
Abstract—Optimizing compilers apply numerous interdependent optimizations, leading to the notoriously difficult phase-ordering problem — that of deciding which transformations...
Konrad Trifunovic, Dorit Nuzman, Albert Cohen, Aya...
IEEEPACT
2009
IEEE
13 years 11 months ago
Data Layout Transformation for Enhancing Data Locality on NUCA Chip Multiprocessors
—With increasing numbers of cores, future CMPs (Chip Multi-Processors) are likely to have a tiled architecture with a portion of shared L2 cache on each tile and a bankinterleave...
Qingda Lu, Christophe Alias, Uday Bondhugula, Thom...
IEEEPACT
2009
IEEE
13 years 11 months ago
Architecture Support for Improving Bulk Memory Copying and Initialization Performance
—Bulk memory copying and initialization is one of the most ubiquitous operations performed in current computer systems by both user applications and Operating Systems. While many...
Xiaowei Jiang, Yan Solihin, Li Zhao, Ravishankar I...
IEEEPACT
2009
IEEE
13 years 11 months ago
Anaphase: A Fine-Grain Thread Decomposition Scheme for Speculative Multithreading
Industry is moving towards multi-core designs as we have hit the memory and power walls. Multi-core designs are very effective to exploit thread-level parallelism (TLP) but do not...
Carlos Madriles, Pedro López, Josep M. Codi...