Sciweavers

14 search results - page 2 / 3
» Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms
ICML
2008
IEEE
14 years 5 months ago
A worst-case comparison between temporal difference and residual gradient with linear function approximation
Residual gradient (RG) was proposed as an alternative to TD(0) for policy evaluation when function approximation is used, but there exists little formal analysis comparing them ex...
Lihong Li
SPAA
2004
ACM
13 years 10 months ago
Packet-mode policies for input-queued switches
This paper considers the problem of packet-mode scheduling of input-queued switches. Packets have variable lengths, and are divided into cells of unit length. Each packet arrives t...
Dan Guez, Alexander Kesselman, Adi Rosén
GECCO
2009
Springer
13 years 2 months ago
Uncertainty handling CMA-ES for reinforcement learning
The covariance matrix adaptation evolution strategy (CMA-ES) has proven to be a powerful method for reinforcement learning (RL). Recently, the CMA-ES has been augmented with an ada...
Verena Heidrich-Meisner, Christian Igel
CN
2004
13 years 4 months ago
Modeling correlations in web traces and implications for designing replacement policies
A number of web cache-related algorithms, such as replacement and prefetching policies, rely on specific characteristics present in the sequence of requests for efficient performa...
Konstantinos Psounis, An Zhu, Balaji Prabhakar, Ra...
TON
2008
13 years 4 months ago
A comparative analysis of server selection in content replication networks
Server selection plays an essential role in content replication networks, such as peer-to-peer (P2P) and content delivery networks (CDNs). In this paper, we perform an analytical i...
Tao Wu, David Starobinski