Sciweavers

931 search results - page 92 / 187
» Compiling for vector-thread architectures
Sort
View
CF
2009
ACM
15 years 11 months ago
Mapping the LU decomposition on a many-core architecture: challenges and solutions
Recently, multi-core architectures with alternative memory subsystem designs have emerged. Instead of using hardwaremanaged cache hierarchies, they employ software-managed embedde...
Ioannis E. Venetis, Guang R. Gao
MICRO
2008
IEEE
137views Hardware» more  MICRO 2008»
15 years 11 months ago
Tradeoffs in designing accelerator architectures for visual computing
Visualization, interaction, and simulation (VIS) constitute a class of applications that is growing in importance. This class includes applications such as graphics rendering, vid...
Aqeel Mahesri, Daniel R. Johnson, Neal C. Crago, S...
IEEEPACT
2002
IEEE
15 years 9 months ago
Optimizing Loop Performance for Clustered VLIW Architectures
Modern embedded systems often require high degrees of instruction-level parallelism (ILP) within strict constraints on power consumption and chip cost. Unfortunately, a high-perfo...
Yi Qian, Steve Carr, Philip H. Sweany
ISCAPDCS
2007
15 years 6 months ago
Architectural requirements of parallel computational biology applications with explicit instruction level parallelism
—The tremendous growth in the information culture, efficient digital searches are needed to extract and identify information from huge data. The notion that evolution in silicon ...
Naeem Zafar Azeemi
PPOPP
2010
ACM
15 years 11 months ago
An adaptive performance modeling tool for GPU architectures
This paper presents an analytical model to predict the performance of general-purpose applications on a GPU architecture. The model is designed to provide performance information ...
Sara S. Baghsorkhi, Matthieu Delahaye, Sanjay J. P...