Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...
It is known that the complexity of the reinforcement learning algorithms, such as Q-learning, may be exponential in the number of environment’s states. It was shown, however, th...
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
Abstract— Continuous action sets are used in many reinforcement learning (RL) applications in robot control since the control input is continuous. However, discrete action sets a...
Akihiko Yamaguchi, Jun Takamatsu, Tsukasa Ogasawar...
We consider the task of reinforcement learning in an environment in which rare significant events occur independently of the actions selected by the controlling agent. If these ev...