Model-Based Average Reward Reinforcement Learning

10 years 1 months ago
Model-Based Average Reward Reinforcement Learning
Reinforcement Learning (RL) is the study of programs that improve their performance by receiving rewards and punishments from the environment. Most RL methods optimize the discounted total reward received by an agent, while, in many domains, the natural criterion is to optimize the average reward per time step. In this paper, we introduce a model-based Average-reward Reinforcement Learning method called H-learning and show that it converges more quickly and robustly than its discounted counterpart in the domain of scheduling a simulated Automatic Guided Vehicle (AGV). We also introduce a version of H-learning that automatically explores the unexplored parts of the state space, while always choosing greedy actions with respect to the current value function. We show that this \Auto-exploratory H-Learning" performs better than the previously studied exploration strategies. To scale H-learning to larger state spaces, we extend it to learn action models and reward functions in the for...
Prasad Tadepalli, DoKyeong Ok
Added 21 Dec 2010
Updated 21 Dec 2010
Type Journal
Year 1998
Where AI
Authors Prasad Tadepalli, DoKyeong Ok
Comments (0)