ICML
2000
IEEE

A Bayesian Framework for Reinforcement Learning

The reinforcement learning problem can be decomposed into two parallel types of inference: (i) estimating the parameters of a model for the underlying process; (ii) determining behavior which maximizes return under the estimated model. Following Dearden, Friedman and Andre (1999), it is proposed that the learning process estimates online the full posterior distribution over models. To determine behavior, a hypothesis is sampled from this distribution and the greedy policy with respect to the hypothesis is obtained by dynamic programming. By using a different hypothesis for each trial, appropriate exploratory and exploitative behavior is obtained. This Bayesian method always converges to the optimal policy for a stationary process with discrete states.
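The loop described in the abstract (maintain a posterior over models, sample one hypothesis per trial, act greedily on it) can be illustrated with a minimal sketch. This is not the paper's implementation: the Dirichlet prior over transitions, the Beta prior over Bernoulli rewards, the fixed trial length, the toy two-state process, and every function and variable name below are assumptions made for illustration only.

```python
import numpy as np


def value_iteration(P, R, gamma=0.95, tol=1e-8, max_iter=10_000):
    """Greedy policy for a known MDP via dynamic programming.

    P: (S, A, S) transition probabilities; R: (S, A) expected rewards.
    """
    V = np.zeros(P.shape[0])
    for _ in range(max_iter):
        Q = R + gamma * (P @ V)            # Bellman backup, shape (S, A)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return (R + gamma * (P @ V)).argmax(axis=1)


def sample_model(trans_counts, r_alpha, r_beta, rng):
    """Draw one hypothesis (P, R) from the posterior (assumed conjugate priors)."""
    S, A, _ = trans_counts.shape
    P = np.empty((S, A, S))
    for s in range(S):
        for a in range(A):
            P[s, a] = rng.dirichlet(trans_counts[s, a])   # Dirichlet posterior
    R = rng.beta(r_alpha, r_beta)                          # Beta posterior
    return P, R


def run_trials(P_true, R_true, n_trials=200, horizon=20, gamma=0.95, seed=0):
    """One sampled hypothesis per trial; the posterior is updated online."""
    rng = np.random.default_rng(seed)
    S, A, _ = P_true.shape
    trans_counts = np.ones((S, A, S))   # assumed Dirichlet(1, ..., 1) prior
    r_alpha = np.ones((S, A))           # assumed Beta(1, 1) prior on rewards
    r_beta = np.ones((S, A))
    for _ in range(n_trials):
        P_hat, R_hat = sample_model(trans_counts, r_alpha, r_beta, rng)
        policy = value_iteration(P_hat, R_hat, gamma)
        s = 0                            # assumed fixed start state
        for _ in range(horizon):
            a = policy[s]
            s_next = rng.choice(S, p=P_true[s, a])
            r = int(rng.random() < R_true[s, a])   # Bernoulli reward
            trans_counts[s, a, s_next] += 1
            r_alpha[s, a] += r
            r_beta[s, a] += 1 - r
            s = s_next
    return trans_counts, r_alpha, r_beta


# Hypothetical toy process: action 0 stays, action 1 switches state;
# only staying in state 1 is rewarded.
P_TRUE = np.array([[[1.0, 0.0], [0.0, 1.0]],
                   [[0.0, 1.0], [1.0, 0.0]]])
R_TRUE = np.array([[0.0, 0.0],
                   [1.0, 0.0]])
```

Calling `run_trials(P_TRUE, R_TRUE)` returns the posterior counts after all trials. Early trials act on widely varying sampled models, which produces exploration; as the counts grow, the samples concentrate and the greedy policies become exploitative.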

Malcolm J. A. Strens

Related Content

» Bayesian Reward Filtering

» A Bayesian Approach to Imitation in Reinforcement Learning

» Multitask reinforcement learning: a hierarchical Bayesian approach

» Deictic Option Schemas

» Model-Based Bayesian Reinforcement Learning in Large Structured Domains

» Bayesian reinforcement learning for POMDP-based dialogue systems

» Bayesian multitask inverse reinforcement learning

» An analytic solution to discrete Bayesian reinforcement learning

» Sequential decision making in repeated coalition formation under uncertainty

Post Info
Type	Conference
Year	2000
Where	ICML
Authors	Malcolm J. A. Strens