Sciweavers

1140 search results - page 211 / 228
» Generating compilers for generated datapaths
Sort
View
CC
2009
Springer
106views System Software» more  CC 2009»
15 years 11 months ago
Blind Optimization for Exploiting Hardware Features
Software systems typically exploit only a small fraction of the realizable performance from the underlying microprocessors. While there has been much work on hardware-aware optimiz...
Dan Knights, Todd Mytkowicz, Peter F. Sweeney, Mic...
ICS
2009
Tsinghua U.
15 years 11 months ago
High-performance CUDA kernel execution on FPGAs
In this work, we propose a new FPGA design flow that combines the CUDA programming model from Nvidia with the state of the art high-level synthesis tool AutoPilot from AutoESL, to...
Alexandros Papakonstantinou, Karthik Gururaj, John...
153
Voted
IPPS
2009
IEEE
15 years 11 months ago
Implementing and evaluating multithreaded triad census algorithms on the Cray XMT
Commonly represented as directed graphs, social networks depict relationships and behaviors among social entities such as people, groups, and organizations. Social network analysi...
George Chin Jr., Andrès Márquez, Sut...
CGO
2008
IEEE
15 years 11 months ago
Spice: speculative parallel iteration chunk execution
The recent trend in the processor industry of packing multiple processor cores in a chip has increased the importance of automatic techniques for extracting thread level paralleli...
Easwaran Raman, Neil Vachharajani, Ram Rangan, Dav...
CODES
2008
IEEE
15 years 11 months ago
Static analysis of processor stall cycle aggregation
Processor Idle Cycle Aggregation (PICA) is a promising approach for low power execution of processors, in which small memory stalls are aggregated to create a large one, and the p...
Jongeun Lee, Aviral Shrivastava