Sciweavers

535 search results - page 9 / 107
» The Cache Performance and Optimizations of Blocked Algorithm...
VALUETOOLS
2006
ACM
15 years 5 months ago
Detailed cache simulation for detecting bottleneck, miss reason and optimization potentialities
Cache locality optimization is an efficient way for reducing the idle time of modern processors in waiting for needed data. This kind of optimization can be achieved either on the...
Jie Tao, Wolfgang Karl
HPCC
2007
Springer
15 years 5 months ago
A Block JRS Algorithm for Highly Parallel Computation of SVDs
This paper presents a new algorithm for computing the singular value decomposition (SVD) on multilevel memory hierarchy architectures. This algorithm is based on one-sided JRS iter...
Mostafa I. Soliman, Sanguthevar Rajasekaran, Reda ...
ALGORITHMICA
2005
14 years 11 months ago
Optimal Read-Once Parallel Disk Scheduling
An optimal prefetching and I/O scheduling algorithm L-OPT, for parallel I/O systems, using a read-once model of block references is presented. The algorithm uses knowledge of the n...
Mahesh Kallahalla, Peter J. Varman
GECCO
2009
Springer
14 years 9 months ago
Improving SMT performance: an application of genetic algorithms to configure resizable caches
Simultaneous Multithreading (SMT) is a technology aimed at improving the throughput of the processor core by applying Instruction Level Parallelism (ILP) and Thread Level Parallel...
Josefa Díaz, José Ignacio Hidalgo, F...
DAC
1998
ACM
16 years 18 days ago
Code Compression for Embedded Systems
Memory is one of the most restricted resources in many modern embedded systems. Code compression can provide substantial savings in terms of size. In a compressed code CPU, a cach...
Haris Lekatsas, Wayne Wolf