Sciweavers

931 search results - page 81 / 187
» Compiling for vector-thread architectures
Sort
View
ACMMSP
2004
ACM
92views Hardware» more  ACMMSP 2004»
15 years 10 months ago
Instruction combining for coalescing memory accesses using global code motion
Instruction combining is an optimization to replace a sequence of instructions with a more efficient instruction yielding the same result in a fewer machine cycles. When we use it...
Motohiro Kawahito, Hideaki Komatsu, Toshio Nakatan...
DAC
2005
ACM
15 years 6 months ago
Performance simulation modeling for fast evaluation of pipelined scalar processor by evaluation reuse
This paper proposes a rapid and accurate evaluation scheme for cycle counts of a pipelined processor using evaluation reuse technique. Since exploration of an optimal processor is...
Ho Young Kim, Tag Gon Kim
DATE
2006
IEEE
113views Hardware» more  DATE 2006»
15 years 10 months ago
An interprocedural code optimization technique for network processors using hardware multi-threading support
Sophisticated C compiler support for network processors (NPUs) is required to improve their usability and consequently, their acceptance in system design. Nonetheless, high-level ...
Hanno Scharwächter, Manuel Hohenauer, Rainer ...
ASPLOS
2006
ACM
15 years 10 months ago
Tartan: evaluating spatial computation for whole program execution
Spatial Computing (SC) has been shown to be an energy-efficient model for implementing program kernels. In this paper we explore the feasibility of using SC for more than small k...
Mahim Mishra, Timothy J. Callahan, Tiberiu Chelcea...
ICS
2005
Tsinghua U.
15 years 10 months ago
Think globally, search locally
A key step in program optimization is the determination of optimal values for code optimization parameters such as cache tile sizes and loop unrolling factors. One approach, which...
Kamen Yotov, Keshav Pingali, Paul Stodghill