Sciweavers

ICML
2000
IEEE
15 years 10 months ago
Eligibility Traces for Off-Policy Policy Evaluation
Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference meth...
Doina Precup, Richard S. Sutton, Satinder P. Singh
88
Voted
ICML
2005
IEEE
15 years 10 months ago
Reinforcement learning with Gaussian processes
Gaussian Process Temporal Difference (GPTD) learning offers a Bayesian solution to the policy evaluation problem of reinforcement learning. In this paper we extend the GPTD framew...
Yaakov Engel, Shie Mannor, Ron Meir