Sciweavers

1140 search results - page 211 / 228
» Generating compilers for generated datapaths
Sort
View
124
Voted
CC
2009
Springer
106views System Software» more  CC 2009»
15 years 9 months ago
Blind Optimization for Exploiting Hardware Features
Software systems typically exploit only a small fraction of the realizable performance from the underlying microprocessors. While there has been much work on hardware-aware optimiz...
Dan Knights, Todd Mytkowicz, Peter F. Sweeney, Mic...
124
Voted
ICS
2009
Tsinghua U.
15 years 9 months ago
High-performance CUDA kernel execution on FPGAs
In this work, we propose a new FPGA design flow that combines the CUDA programming model from Nvidia with the state of the art high-level synthesis tool AutoPilot from AutoESL, to...
Alexandros Papakonstantinou, Karthik Gururaj, John...
134
Voted
IPPS
2009
IEEE
15 years 9 months ago
Implementing and evaluating multithreaded triad census algorithms on the Cray XMT
Commonly represented as directed graphs, social networks depict relationships and behaviors among social entities such as people, groups, and organizations. Social network analysi...
George Chin Jr., Andrès Márquez, Sut...
141
Voted
CGO
2008
IEEE
15 years 9 months ago
Spice: speculative parallel iteration chunk execution
The recent trend in the processor industry of packing multiple processor cores in a chip has increased the importance of automatic techniques for extracting thread level paralleli...
Easwaran Raman, Neil Vachharajani, Ram Rangan, Dav...
133
Voted
CODES
2008
IEEE
15 years 9 months ago
Static analysis of processor stall cycle aggregation
Processor Idle Cycle Aggregation (PICA) is a promising approach for low power execution of processors, in which small memory stalls are aggregated to create a large one, and the p...
Jongeun Lee, Aviral Shrivastava