policy evaluation problem

218

ICML
2000
IEEE

153views Machine Learning» more ICML 2000»

Eligibility Traces for Off-Policy Policy Evaluation

16 years 8 months ago

Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference meth...

Doina Precup, Richard S. Sutton, Satinder P. Singh

claim paper

Read More »

197

click to vote

ICML
2005
IEEE

100views Machine Learning» more ICML 2005»

Reinforcement learning with Gaussian processes

16 years 8 months ago

Download www.machinelearning.org

Gaussian Process Temporal Difference (GPTD) learning offers a Bayesian solution to the policy evaluation problem of reinforcement learning. In this paper we extend the GPTD framew...

Yaakov Engel, Shie Mannor, Ron Meir

claim paper

Read More »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers