Sciweavers

ICML
2001
IEEE
Off-Policy Temporal Difference Learning with Function Approximation
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Doina Precup, Richard S. Sutton, Sanjoy Dasgupta