Search Sciweavers | Sciweavers

35 search results - page 4 / 7

» Loop Scheduling and Partitions for Hiding Memory Latencies

ICPP
1998
IEEE

222 views · Distributed And Parallel Com...

A memory-layout oriented run-time technique for locality optimization

13 years 10 months ago

Download home.eng.iastate.edu

Exploiting locality at run-time is a complementary approach to a compiler approach for those applications with dynamic memory access patterns. This paper proposes a memory-layout ...

Yong Yan, Xiaodong Zhang, Zhao Zhang

ISCA
2010
IEEE

185 views · Hardware

Dynamic warp subdivision for integrated branch and memory divergence tolerance

13 years 11 months ago

Download www.cs.virginia.edu

SIMD organizations amortize the area and power of fetch, decode, and issue logic across multiple processing units in order to maximize throughput for a given area and power budget...

Jiayuan Meng, David Tarjan, Kevin Skadron

ICS
2007
Tsinghua U.

171 views · Distributed And Parallel Com...

Automatic nonblocking communication for partitioned global address space programs

14 years 9 days ago

Download www.eecs.berkeley.edu

Overlapping communication with computation is an important optimization on current cluster architectures; its importance is likely to increase as the doubling of processing power ...

Wei-Yu Chen, Dan Bonachea, Costin Iancu, Katherine...

MICRO
2002
IEEE

143 views · Hardware

Effective instruction scheduling techniques for an interleaved cache clustered VLIW processor

13 years 11 months ago

Download personals.ac.upc.edu

Clustering is a common technique to overcome the wire delay problem incurred by the evolution of technology. Fully-distributed architectures, where the register file, the functio...

Enric Gibert, F. Jesús Sánchez, Anto...

DAC
2000
ACM

112 views · Computer Architecture

Memory aware compilation through accurate timing extraction

14 years 7 months ago

Download www.cs.ucr.edu

Memory delays represent a major bottleneck in embedded systems performance. Newer memory modules exhibiting efficient access modes (e.g., page-, burst-mode) partly alleviate this ...

Peter Grun, Nikil D. Dutt, Alexandru Nicolau
