Sciweavers

2784 search results - page 224 / 557
» Instruction Level Parallelism
Sort
View
IEEEINTERACT
2003
IEEE
15 years 9 months ago
Procedure Cloning and Integration for Converting Parallelism from Coarse to Fine Grain
This paper introduces a method for improving program run-time performance by gathering work in an application and executing it efficiently in an integrated thread. Our methods ext...
Won So, Alexander G. Dean
HPCA
2008
IEEE
16 years 4 months ago
PEEP: Exploiting predictability of memory dependences in SMT processors
Simultaneous Multithreading (SMT) attempts to keep a dynamically scheduled processor's resources busy with work from multiple independent threads. Threads with longlatency st...
Samantika Subramaniam, Milos Prvulovic, Gabriel H....
MICRO
2007
IEEE
108views Hardware» more  MICRO 2007»
15 years 10 months ago
FPGA-Accelerated Simulation Technologies (FAST): Fast, Full-System, Cycle-Accurate Simulators
This paper describes FAST, a novel simulation methodology that can produce simulators that (i) are orders of magnitude faster than comparable simulators, (ii) are cycleaccurate, (...
Derek Chiou, Dam Sunwoo, Joonsoo Kim, Nikhil A. Pa...
ISCA
2005
IEEE
166views Hardware» more  ISCA 2005»
15 years 9 months ago
Increased Scalability and Power Efficiency by Using Multiple Speed Pipelines
One of the most important problems faced by microarchitecture designers is the poor scalability of some of the current solutions with increased clock frequencies and wider pipelin...
Emil Talpes, Diana Marculescu
ISCAS
2005
IEEE
152views Hardware» more  ISCAS 2005»
15 years 9 months ago
Dictionary-based program compression on transport triggered architectures
— Program code size has become a critical design constraint of embedded systems. Large program codes require large memories, which increase the size and cost of the chip. Poor co...
Jari Heikkinen, Andrea G. M. Cilio, Jarmo Takala, ...