Sciweavers

581 search results - page 99 / 117
» Implementing the Best Processor Cores
Sort
View
TPDS
2008
134views more  TPDS 2008»
15 years 1 months ago
Extending the TokenCMP Cache Coherence Protocol for Low Overhead Fault Tolerance in CMP Architectures
It is widely accepted that transient failures will appear more frequently in chips designed in the near future due to several factors such as the increased integration scale. On th...
Ricardo Fernández Pascual, José M. G...
CGO
2005
IEEE
15 years 7 months ago
Optimizing Sorting with Genetic Algorithms
The growing complexity of modern processors has made the generation of highly efficient code increasingly difficult. Manual code generation is very time consuming, but it is oft...
Xiaoming Li, María Jesús Garzar&aacu...
IPPS
2010
IEEE
14 years 11 months ago
Optimization of linked list prefix computations on multithreaded GPUs using CUDA
We present a number of optimization techniques to compute prefix sums on linked lists and implement them on multithreaded GPUs using CUDA. Prefix computations on linked structures ...
Zheng Wei, Joseph JáJá
DCC
2007
IEEE
16 years 1 months ago
Algorithms and Hardware Structures for Unobtrusive Real-Time Compression of Instruction and Data Address Traces
Instruction and data address traces are widely used by computer designers for quantitative evaluations of new architectures and workload characterization, as well as by software de...
Milena Milenkovic, Aleksandar Milenkovic, Martin B...
ISCA
2007
IEEE
198views Hardware» more  ISCA 2007»
15 years 8 months ago
Making the fast case common and the uncommon case simple in unbounded transactional memory
Hardware transactional memory has great potential to simplify the creation of correct and efficient multithreaded programs, allowing programmers to exploit more effectively the s...
Colin Blundell, Joe Devietti, E. Christopher Lewis...