In this paper we consider approximate policy-iteration-based reinforcement learning algorithms. In order to implement a flexible function approximation scheme we propose the use o...
Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...
Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...
In this paper, we propose a policy gradient reinforcement learning algorithm to address transition-independent Dec-POMDPs. This approach aims at implicitly exploiting the locality...
Abstract. Feature selection in reinforcement learning (RL), i.e. choosing basis functions such that useful approximations of the unkown value function can be obtained, is one of th...
Several reinforcement-learning techniques have already been applied to the Acrobot control problem, using linear function approximators to estimate the value function. In this pape...