The learning classifier system XCS is an iterative rulelearning system that evolves rule structures based on gradient-based prediction and rule quality estimates. Besides classifi...
Martin V. Butz, Pier Luca Lanzi, Stewart W. Wilson
Reinforcement learning algorithms can become unstable when combined with linear function approximation. Algorithms that minimize the mean-square Bellman error are guaranteed to co...
We address two open theoretical questions in Policy Gradient Reinforcement Learning. The first concerns the efficacy of using function approximation to represent the state action ...