The demand for high performance has driven acyclic computation accelerators into extensive use in modern embedded and desktop architectures. Accelerators that are ideal from a sof...
This paper describes the implementation of a runtime library for asynchronous communication in the Cell BE processor. The runtime library implementation provides with several servi...
This paper presents the alternatives available to support threadprivate data in OpenMP and evaluates them. We show how current compilation systems rely on custom techniques for im...
Interprocessor synchronization, while extremely important for ensuring execution correctness, can be very costly in terms of both power and performance overheads. Unfortunately, m...
Byte-code based languages are slowly becoming adopted in embedded domains because of improved security and portability. Another potential reason for their adoption is the reputati...