Using multilayer perceptrons (MLPs) to approximate the state-action value function in reinforcement learning (RL) algorithms could become a nightmare due to the constant possibilit...
We address two open theoretical questions in Policy Gradient Reinforcement Learning. The first concerns the efficacy of using function approximation to represent the state-action ...
We focus on neuro-dynamic programming methods to learn state-action value functions and outline some of the inherent problems to be faced when performing reinforcement learning in...
We present first experiments using Support Vector Regression as a function approximator for an on-line, SARSA-like reinforcement learner. To overcome the batch nature of S...
This paper presents a direct reinforcement learning algorithm, called Finite-Element Reinforcement Learning, in the continuous case, i.e. continuous state-space and time. The eval...