In this paper we consider approximate policy-iteration-based reinforcement learning algorithms. In order to implement a flexible function approximation scheme we propose the use o...
Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...
We identify two fundamental points of utilizing CBR for an adaptive agent that tries to learn on the basis of trial and error without a model of its environment. The first link co...
Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...
In this paper, we propose a policy gradient reinforcement learning algorithm to address transition-independent Dec-POMDPs. This approach aims at implicitly exploiting the locality...
Abstract. Feature selection in reinforcement learning (RL), i.e. choosing basis functions such that useful approximations of the unkown value function can be obtained, is one of th...