
ICML
2005
IEEE


Reinforcement learning with Gaussian processes


Download www.machinelearning.org

Gaussian Process Temporal Difference (GPTD) learning offers a Bayesian solution to the policy evaluation problem of reinforcement learning. In this paper we extend the GPTD framework by addressing two pressing issues, which were not adequately treated in the original GPTD paper (Engel et al., 2003). The first is the issue of stochasticity in the state transitions, and the second is concerned with action selection and policy improvement. We present a new generative model for the value function, deduced from its relation with the discounted return. We derive a corresponding on-line algorithm for learning the posterior moments of the value Gaussian process. We also present a SARSA-based extension of GPTD, termed GPSARSA, that allows the selection of actions and the gradual improvement of policies without requiring a world-model.
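To make the idea of a Gaussian process posterior over the value function more concrete, the following is a minimal batch sketch in NumPy. It is not the on-line algorithm from the paper. It assumes a zero-mean GP prior on the value V, a squared-exponential kernel over one-dimensional states, and a reward model of the form R_i = V(x_i) - gamma * V(x_{i+1}) + noise_i, where the noise comes from i.i.d. return residuals with variance `noise_std**2`. Under that assumption the reward noise covariance is `noise_std**2 * H @ H.T`. The names `rbf_kernel`, `gptd_posterior_mean`, `length_scale`, and `noise_std` are hypothetical and chosen for illustration only.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between 1-D state arrays a and b (assumed prior)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gptd_posterior_mean(states, rewards, query, gamma=0.9, noise_std=0.1):
    """Batch posterior mean of the value at `query` states from one trajectory.

    states:  shape (t,), the visited states x_0 .. x_{t-1}
    rewards: shape (t-1,), rewards[i] observed on the step states[i] -> states[i+1]
    """
    t = len(states)
    # H encodes the linear relation R_i = V(x_i) - gamma * V(x_{i+1}) + noise_i.
    H = np.zeros((t - 1, t))
    idx = np.arange(t - 1)
    H[idx, idx] = 1.0
    H[idx, idx + 1] = -gamma

    K = rbf_kernel(states, states)
    # Assumption: i.i.d. return residuals, so reward noise is correlated across steps.
    Sigma = noise_std ** 2 * (H @ H.T)

    # Standard Gaussian conditioning: E[V(q) | r] = k(q)^T H^T (H K H^T + Sigma)^{-1} r
    G = H @ K @ H.T + Sigma
    alpha = H.T @ np.linalg.solve(G, rewards)
    return rbf_kernel(query, states) @ alpha


if __name__ == "__main__":
    xs = np.linspace(0.0, 1.0, 6)   # toy trajectory of six states
    rs = np.ones(5)                 # reward of 1 on every transition
    print(gptd_posterior_mean(xs, rs, np.array([0.0, 0.5, 1.0])))
```

The batch solve costs cubic time in the trajectory length, which is one reason an on-line formulation such as the one described in the abstract is attractive. The posterior mean is linear in the observed rewards, so doubling every reward doubles the estimated values.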

Yaakov Engel, Shie Mannor, Ron Meir


ICML 2005 | Machine Learning | Policy Evaluation Problem | Process Temporal Difference | Value Gaussian Process |


Related Content

» Gaussian Processes in Reinforcement Learning

» Graph Kernels and Gaussian Processes for Relational Reinforcement Learning

» Gaussian Processes for Sample Efficient Reinforcement Learning with RMAXLike Exploration

» Bayesian reinforcement learning in continuous POMDPs with gaussian processes

» Gaussian Processes and Reinforcement Learning for Identification and Control of an Autonom...

» Adaptive autonomous control using online value iteration with gaussian processes

» Improving humanoid locomotive performance with learnt approximated dynamics via Gaussian p...

» Reinforcement learning agents with primary knowledge designed by analytic hierarchy proces...

» Autonomous blimp control using modelfree reinforcement learning in a continuous state and ...

Post Info

Added	17 Nov 2009
Updated	17 Nov 2009
Type	Conference
Year	2005
Where	ICML
Authors	Yaakov Engel, Shie Mannor, Ron Meir
