Sciweavers

162 search results - page 7 / 33
» Off-Policy Temporal Difference Learning with Function Approximation
ICML 1995
Stable Function Approximation in Dynamic Programming
The success of reinforcement learning in practical problems depends on the ability to combine function approximation with temporal difference methods such as value iteration. Experime...
Geoffrey J. Gordon
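
Gordon's result concerns approximators that are non-expansions ("averagers"), which keep value iteration stable under function approximation. Below is a minimal sketch of that idea on an illustrative toy MDP; the 5-state setup, uniform transitions, and hand-built averaging matrix are assumptions for demonstration, not from the paper.

```python
import numpy as np

n_states, gamma = 5, 0.9
P = np.full((n_states, n_states), 1.0 / n_states)  # toy uniform transitions
R = np.linspace(0.0, 1.0, n_states)                # toy per-state rewards

# An averager maps a value table through fixed convex weights (each row is
# non-negative and sums to 1). Such non-expansions in the max norm keep the
# composed approximate Bellman operator a gamma-contraction.
W = 0.5 * np.eye(n_states) + 0.5 * np.full((n_states, n_states), 1.0 / n_states)

V = np.zeros(n_states)
for _ in range(200):
    V = W @ (R + gamma * (P @ V))  # Bellman backup, then averager projection
print(V)  # converges to a unique fixed point
```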
NIPS 2008
Regularized Policy Iteration
In this paper we consider approximate policy-iteration-based reinforcement learning algorithms. In order to implement a flexible function approximation scheme we propose the use o...
Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...
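
One common instantiation of regularized policy evaluation inside approximate policy iteration is L2-regularized LSTD. The sketch below assumes linear features and random toy data; it illustrates the general technique, not the paper's exact algorithm.

```python
import numpy as np

def reg_lstd(phi, phi_next, rewards, gamma=0.99, lam=1e-2):
    # Solve (A + lam*I) w = b, with A = Phi^T (Phi - gamma*Phi') and
    # b = Phi^T r; lam > 0 stabilizes the often ill-conditioned LSTD system.
    A = phi.T @ (phi - gamma * phi_next)
    b = phi.T @ rewards
    return np.linalg.solve(A + lam * np.eye(phi.shape[1]), b)

# Toy usage with random features (illustrative only):
rng = np.random.default_rng(0)
phi = rng.standard_normal((100, 8))       # features of visited states s
phi_next = rng.standard_normal((100, 8))  # features of successor states s'
rewards = rng.standard_normal(100)
w = reg_lstd(phi, phi_next, rewards)
```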
CORR 2010
Neuroevolutionary optimization
Temporal difference methods are theoretically grounded and empirically effective methods for addressing reinforcement learning problems. In most real-world reinforcement learning ...
Eva Volná
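
As background for the title, a minimal neuroevolution loop: a (1+λ)-style evolution strategy over policy parameters. The fitness function here is a placeholder; in practice it would roll out the parameterized policy in the environment and return the episode's total reward. All names and constants are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(w):
    # Placeholder objective; substitute an environment rollout that
    # returns the cumulative reward of the policy parameterized by w.
    return -np.sum((w - 1.0) ** 2)

w, sigma, n_children = np.zeros(10), 0.1, 20
for gen in range(100):
    # Mutate the parent with Gaussian noise and keep the best child
    # only if it improves on the parent's fitness.
    children = w + sigma * rng.standard_normal((n_children, w.size))
    scores = np.array([fitness(c) for c in children])
    best = children[scores.argmax()]
    if fitness(best) > fitness(w):
        w = best
```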
AAAI 2011
Differential Eligibility Vectors for Advantage Updating and Gradient Methods
In this paper we propose differential eligibility vectors (DEV) for temporal-difference (TD) learning, a new class of eligibility vectors designed to bring out the contribution of...
Francisco S. Melo
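
For context, the standard accumulating eligibility vector in linear TD(λ) looks as follows; Melo's differential eligibility vectors modify how this vector is constructed, which this background sketch does not reproduce.

```python
import numpy as np

def td_lambda_update(w, z, phi, r, phi_next, alpha=0.1, gamma=0.99, lam=0.9):
    # One step of linear TD(lambda) with an accumulating eligibility vector:
    # z decays by gamma*lam and accumulates the current feature vector;
    # the TD error is then applied along z rather than along phi alone.
    delta = r + gamma * w @ phi_next - w @ phi  # TD error
    z = gamma * lam * z + phi                   # eligibility vector update
    return w + alpha * delta * z, z
```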
ICML 2007
Tracking value function dynamics to improve reinforcement learning with piecewise linear function approximation
Reinforcement learning algorithms can become unstable when combined with linear function approximation. Algorithms that minimize the mean-square Bellman error are guaranteed to co...
Chee Wee Phua, Robert Fitch
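
The snippet references algorithms that minimize the mean-square Bellman error. A minimal residual-gradient step for a linear value function is sketched below, assuming deterministic transitions so the sampled gradient is unbiased; this is a standard construction, not the paper's method.

```python
import numpy as np

def residual_gradient_step(w, phi, r, phi_next, alpha=0.05, gamma=0.99):
    # Gradient descent on 0.5 * delta^2, where delta is the Bellman error
    # of a linear value function: delta = r + gamma*w.phi' - w.phi.
    delta = r + gamma * w @ phi_next - w @ phi
    grad = delta * (gamma * phi_next - phi)  # gradient of 0.5 * delta**2
    return w - alpha * grad
```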