Sciweavers

2703 search results - page 367 / 541
» Optimizing memory transactions
Sort
View
148
Voted
CF
2009
ACM
15 years 10 months ago
Mapping the LU decomposition on a many-core architecture: challenges and solutions
Recently, multi-core architectures with alternative memory subsystem designs have emerged. Instead of using hardwaremanaged cache hierarchies, they employ software-managed embedde...
Ioannis E. Venetis, Guang R. Gao
134
Voted
ASPLOS
2010
ACM
15 years 10 months ago
COMPASS: a programmable data prefetcher using idle GPU shaders
A traditional fixed-function graphics accelerator has evolved into a programmable general-purpose graphics processing unit over the last few years. These powerful computing cores...
Dong Hyuk Woo, Hsien-Hsin S. Lee
146
Voted
FPGA
2007
ACM
150views FPGA» more  FPGA 2007»
15 years 9 months ago
FPGA-friendly code compression for horizontal microcoded custom IPs
Shrinking time-to-market and high demand for productivity has driven traditional hardware designers to use design methodologies that start from high-level languages. However, meet...
Bita Gorjiara, Daniel Gajski
239
Voted
ICS
2007
Tsinghua U.
15 years 9 months ago
Scheduling FFT computation on SMP and multicore systems
Increased complexity of memory systems to ameliorate the gap between the speed of processors and memory has made it increasingly harder for compilers to optimize an arbitrary code...
Ayaz Ali, S. Lennart Johnsson, Jaspal Subhlok
146
Voted
FCCM
2005
IEEE
139views VLSI» more  FCCM 2005»
15 years 9 months ago
A Study of the Scalability of On-Chip Routing for Just-in-Time FPGA Compilation
Just-in-time (JIT) compilation has been used in many applications to enable standard software binaries to execute on different underlying processor architectures. We previously in...
Roman L. Lysecky, Frank Vahid, Sheldon X.-D. Tan