Sciweavers

HPCA
2000
IEEE
13 years 7 months ago
Cache-Efficient Matrix Transposition
Siddhartha Chatterjee, Sandeep Sen
HPCA
2000
IEEE
13 years 7 months ago
Modified LRU Policies for Improving Second-Level Cache Behavior
Main memory accesses continue to be a significant bottleneck for applications whose working sets do not fit in second-level caches. With the trend of greater associativity in seco...
Wayne A. Wong, Jean-Loup Baer
HPCA
2000
IEEE
13 years 7 months ago
Evaluation of Active Disks for Decision Support Databases
Growth and usage trends for large decision support databases indicate that there is a need for architectures that scale the processing power as the dataset grows. To meet this nee...
Mustafa Uysal, Anurag Acharya, Joel H. Saltz
HPCA
2000
IEEE
13 years 7 months ago
Software-Controlled Multithreading Using Informing Memory Operations
Memory latency is becoming an increasingly important performance bottleneck, especially in multiprocessors. One technique for tolerating memory latency is multithreading, whereby we sw...
Todd C. Mowry, Sherwyn R. Ramkissoon
HPCA
2000
IEEE
13 years 7 months ago
Memory Dependence Speculation Tradeoffs in Centralized, Continuous-Window Superscalar Processors
We consider a variety of dynamic, hardware-based methods for exploiting load/store parallelism, including mechanisms that use memory dependence speculation. While previous work ha...
Andreas Moshovos, Gurindar S. Sohi
HPCA
2000
IEEE
13 years 7 months ago
Design of a Parallel Vector Access Unit for SDRAM Memory Systems
We are attacking the memory bottleneck by building a “smart” memory controller that improves effective memory bandwidth, bus utilization, and cache efficiency by letting appl...
Binu K. Mathew, Sally A. McKee, John B. Carter, Al...
HPCA
2000
IEEE
13 years 7 months ago
Register Organization for Media Processing
Processor architectures with tens to hundreds of arithmetic units are emerging to handle media processing applications. These applications, such as image coding, image synthesis, ...
Scott Rixner, William J. Dally, Brucek Khailany, P...
HPCA
2000
IEEE
13 years 7 months ago
Improving the Throughput of Synchronization by Insertion of Delays
Efficiency of synchronization mechanisms can limit the parallel performance of many shared-memory applications. In addition, the ever increasing performance gap between processor...
Ravi Rajwar, Alain Kägi, James R. Goodman
HPCA
2000
IEEE
13 years 7 months ago
Decoupled Value Prediction on Trace Processors
Value prediction is a technique that breaks true data dependences by predicting the outcome of an instruction, and executes speculatively its data-dependent instructions based on ...
Sang Jeong Lee, Yuan Wang, Pen-Chung Yew
HPCA
2000
IEEE
13 years 7 months ago
Flit-Reservation Flow Control
This paper presents flit-reservation flow control, in which control flits traverse the network in advance of data flits, reserving buffers and channel bandwidth. Flit-reservation ...
Li-Shiuan Peh, William J. Dally