The variables of the high-level specifications and the automatically generated temporary variables are mapped on to the data-path registers during data-path synthesis phase of hig...
Chandan Karfa, Chittaranjan A. Mandal, Dipankar Sa...
To continue to improve processor performance, microarchitects seek to increase the effective instruction level parallelism (ILP) that can be exploited in applications. A fundament...
Tiling has proven to be an effective mechanism to develop high performance implementations of algorithms. Tiling can be used to organize computations so that communication costs i...
Ganesh Bikshandi, Jia Guo, Daniel Hoeflinger, Gheo...
—Real-time encoding of high-definition H.264 video is a challenge to current embedded programmable processors. Emerging stream processing methods supported by most GPUs and progr...
Ju Ren, Yi He, Wei Wu, Mei Wen, Nan Wu, Chunyuan Z...
Traditional DMA requires the operating system to perform many tasks to initiate a transfer, with overhead on the order of hundreds or thousands of CPU instructions. This paper des...
Matthias A. Blumrich, Cezary Dubnicki, Edward W. F...