Sciweavers

914 search results - page 137 / 183
» Assessing the performance limits of parallelized near-thresh...
CLUSTER
2007
IEEE
15 years 4 months ago
Efficient asynchronous memory copy operations on multi-core systems and I/OAT
Bulk memory copies incur large overheads such as CPU stalling (i.e., no overlap of computation with memory copy operation), small register-size data movement, cache pollution, etc...
Karthikeyan Vaidyanathan, Lei Chai, Wei Huang, Dha...
CF
2010
ACM
15 years 5 months ago
Proposition for a sequential accelerator in future general-purpose manycore processors and the problem of migration-induced cach
As the number of transistors on a chip doubles with every technology generation, the number of on-chip cores also increases rapidly, making it possible in the foreseeable future to des...
Pierre Michaud, Yiannakis Sazeides, André S...
CGO
2007
IEEE
15 years 6 months ago
Iterative Optimization in the Polyhedral Model: Part I, One-Dimensional Time
Emerging microprocessors offer unprecedented parallel computing capabilities and deeper memory hierarchies, increasing the importance of loop transformations in optimizing compile...
Louis-Noël Pouchet, Cédric Bastoul, Al...
ICPADS
2005
IEEE
15 years 6 months ago
Efficient Distributed QoS Routing Protocol for MPLS Networks
This paper proposes a new distributed QoS routing protocol, called Efficient Distributed QoS Routing (EDQR), for MPLS networks. The path searching algorithm of EDQR considers an ...
Man-Ching Yuen, Weijia Jia, Chi-Chung Cheung
IPPS
2009
IEEE
15 years 7 months ago
Double Throughput Multiply-Accumulate unit for FlexCore processor enhancements
As a simple five-stage General-Purpose Processor (GPP), the baseline FlexCore processor has a limited set of datapath units. By utilizing a flexible datapath interconnect and...
Tung Thanh Hoang, Magnus Själander, Per Larss...