A Bayesian Framework for Reinforcement Learning

13 years 8 months ago
A Bayesian Framework for Reinforcement Learning
The reinforcement learning problem can be decomposed into two parallel types of inference: (i) estimating the parameters of a model for the underlying process; (ii) determining behavior which maximizes return under the estimated model. Following Dearden, Friedman and Andre (1999), it is proposed that the learning process estimates online the full posterior distribution over models. To determine behavior, a hypothesis is sampled from this distribution and the greedy policy with respect to the hypothesis is obtained by dynamic programming. By using a different hypothesis for each trial appropriate exploratory and exploitative behavior is obtained. This Bayesian method always converges to the optimal policy for a stationary process with discrete states.
Malcolm J. A. Strens
Added 01 Aug 2010
Updated 01 Aug 2010
Type Conference
Year 2000
Where ICML
Authors Malcolm J. A. Strens
Comments (0)