Although tabular reinforcement learning (RL) methods have been proved to converge to an optimal policy, the combination of particular conventional reinforcement learning techniques...
A number of reinforcement learning algorithms have been developed that are guaranteed to converge to the optimal solution when used with lookup tables. It is shown, however, that ...
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
This paper presents a direct reinforcement learning algorithm, called Finite-Element Reinforcement Learning, for the continuous case, i.e., continuous state space and continuous time. The eval...
We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...