Using inaccurate models in reinforcement learning

In the model-based policy search approach to reinforcement learning (RL), policies are found using a model (or "simulator") of the Markov decision process. However, for high-dimensional continuous-state tasks, it can be extremely difficult to build an accurate model, and so the algorithm often returns a policy that works in simulation but not in real life. The other extreme, model-free RL, tends to require infeasibly large numbers of real-life trials. In this paper, we present a hybrid algorithm that requires only an approximate model and only a small number of real-life trials. The key idea is to successively "ground" the policy evaluations using real-life trials, but to rely on the approximate model to suggest local changes. Our theoretical results show that this algorithm achieves near-optimal performance in the real system, even when the model is only approximate. Empirical results also demonstrate that, when given only a crude model and a small number of real-life trials, ...
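The grounding idea in the abstract can be sketched in code. The toy problem below is not from the paper: it is a hypothetical 1-D linear system with a deliberately inaccurate model, a one-parameter policy u = -k*x, and a quadratic cost. Real rollouts score each candidate policy (the "grounding" trials), while the inaccurate model is used only to suggest a local change via a finite-difference gradient; a change is kept only if the real system confirms the improvement.

```python
# Hypothetical toy illustration of grounded policy search with an
# inaccurate model (setup invented for this sketch, not from the paper).
# Real dynamics: x' = x + 1.0*u.  The approximate model wrongly
# assumes x' = x + 0.7*u.

def real_step(x, u):
    return x + 1.0 * u

def model_step(x, u):
    return x + 0.7 * u  # deliberately inaccurate simulator

def rollout_cost(step, k, x0=1.0, horizon=20):
    """Run the policy u = -k*x under the given dynamics;
    accumulate the quadratic cost sum of x^2 + u^2."""
    x, cost = x0, 0.0
    for _ in range(horizon):
        u = -k * x
        cost += x * x + u * u
        x = step(x, u)
    return cost

def hybrid_search(k=0.1, iters=50, lr=0.01, eps=1e-4):
    """Real-life trials ground the evaluation; the approximate model
    only supplies the local improvement direction."""
    best_k, best_cost = k, rollout_cost(real_step, k)
    for _ in range(iters):
        # Model suggests a local change (finite-difference gradient in k).
        grad = (rollout_cost(model_step, k + eps)
                - rollout_cost(model_step, k - eps)) / (2 * eps)
        k_new = k - lr * grad
        # A real-life trial grounds the suggestion: keep it only if
        # the cost on the real system actually goes down.
        c_new = rollout_cost(real_step, k_new)
        if c_new < best_cost:
            best_k, best_cost, k = k_new, c_new, k_new
        else:
            lr *= 0.5  # model's suggestion failed on the real system
    return best_k, best_cost
```

Even though the model's gain on u is off by 30%, its gradient still points in a useful direction, and the real-trial acceptance test prevents the search from trusting the model beyond what the real system confirms.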
Pieter Abbeel, Morgan Quigley, Andrew Y. Ng
Added 17 Nov 2009
Updated 17 Nov 2009
Type Conference
Year 2006
Where ICML
Authors Pieter Abbeel, Morgan Quigley, Andrew Y. Ng