We introduce techniques to support efficient non-atomic execution of very long traces on a new binary translation based, x86-64 compatible VLIW microprocessor. Incrementally comm...
High-level synthesis (HLS) has been successfully targeted towards the digital signal processing (DSP) domain. Both application-specic integrated circuits (ASICs) and application-...
Cache misses form a major bottleneck for memory-intensive applications, due to the significant latency of main memory accesses. Loop tiling, in conjunction with other program tran...
Recently, skyline queries have attracted much attention in the database research community. Space partitioning techniques, such as recursive division of the data space, have been ...
Hardware counters are a fundamental building block of modern high-performance processors. This paper explores two applications of probabilistic counter updates, in which the outpu...