Sciweavers

54 search results - page 3 / 11
» Convergence Results for Single-Step On-Policy Reinforcement-...
Sort
View
IJCAI
2001
13 years 7 months ago
Exploiting Multiple Secondary Reinforcers in Policy Gradient Reinforcement Learning
Most formulations of Reinforcement Learning depend on a single reinforcement reward value to guide the search for the optimal policy solution. If observation of this reward is rar...
Gregory Z. Grudic, Lyle H. Ungar
AAMAS
2007
Springer
14 years 10 days ago
Continuous-State Reinforcement Learning with Fuzzy Approximation
Abstract. Reinforcement learning (RL) is a widely used learning paradigm for adaptive agents. There exist several convergent and consistent RL algorithms which have been intensivel...
Lucian Busoniu, Damien Ernst, Bart De Schutter, Ro...
ICML
2007
IEEE
14 years 7 months ago
Tracking value function dynamics to improve reinforcement learning with piecewise linear function approximation
Reinforcement learning algorithms can become unstable when combined with linear function approximation. Algorithms that minimize the mean-square Bellman error are guaranteed to co...
Chee Wee Phua, Robert Fitch
CORR
2012
Springer
216views Education» more  CORR 2012»
12 years 1 months ago
Fractional Moments on Bandit Problems
Reinforcement learning addresses the dilemma between exploration to find profitable actions and exploitation to act according to the best observations already made. Bandit proble...
Ananda Narayanan B., Balaraman Ravindran
SIAMCO
2000
117views more  SIAMCO 2000»
13 years 6 months ago
The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
It is shown here that stability of the stochastic approximation algorithm is implied by the asymptotic stability of the origin for an associated ODE. This in turn implies convergen...
Vivek S. Borkar, Sean P. Meyn