Although tabular reinforcement learning (RL) methods have been proved to converge to an optimal policy, the combination of particular conventional reinforcement learning techniques...
This paper introduces a gradient-based reward prediction update mechanism to the XCS classifier system as applied in neuralnetwork type learning and function approximation mechani...
Martin V. Butz, David E. Goldberg, Pier Luca Lanzi
Planning problems are often formulated as heuristic search. The choice of the heuristic function plays a significant role in the performance of planning systems, but a good heuris...
Abstract— Continuous action sets are used in many reinforcement learning (RL) applications in robot control since the control input is continuous. However, discrete action sets a...
Akihiko Yamaguchi, Jun Takamatsu, Tsukasa Ogasawar...
This paper presents a novel framework called proto-reinforcement learning (PRL), based on a mathematical model of a proto-value function: these are task-independent basis function...