We describe an algorithm for learning in the presence of multiple criteria. Our technique generalizes previous approaches in that it can learn optimal policies for all linear pref...
This paper introduces a gradient-based reward prediction update mechanism to the XCS classifier system as applied in neuralnetwork type learning and function approximation mechani...
Martin V. Butz, David E. Goldberg, Pier Luca Lanzi
Abstract--Reinforcement learning (RL) research typically develops algorithms for helping an RL agent best achieve its goals-however they came to be defined--while ignoring the rela...
In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...
Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...
Reinforcement Learning research is traditionally devoted to solve single-task problems. Therefore, anytime a new task is faced, learning must be restarted from scratch. Recently, ...