This paper presents the first automatic scheme to allocate local (stack) data in recursive functions to scratch-pad memory (SPM) in embedded systems. A scratch-pad is a fast direct...
Embedded platforms are resource-constrained systems in which performance and memory requirements of executed code are of critical importance. However, standard techniques such as ...
Carmen Badea, Alexandru Nicolau, Alexander V. Veid...
Delayed branching is a technique to alleviate branch hazards without expensive hardware branch prediction mechanisms. For VLIW processors with deep pipelines and many issue slots,...
Dynamic binary translation (DBT) has been used to achieve numerous goals (e.g., better performance) for general-purpose computers. Recently, DBT has also attracted attention for e...
Prior work on modeling interconnects has focused on optimizing the wire and repeater design for trading off energy and delay, and is largely based on low level circuit parameters....
Rahul Nagpal, Arvind Madan, Bharadwaj Amrutur, Y. ...