Sciweavers

5 search results - page 1 / 1
ICML
2008
IEEE
14 years 5 months ago
A worst-case comparison between temporal difference and residual gradient with linear function approximation
Residual gradient (RG) was proposed as an alternative to TD(0) for policy evaluation when function approximation is used, but there exists little formal analysis comparing them ex...
Lihong Li
ICML
2001
IEEE
14 years 5 months ago
Off-Policy Temporal Difference Learning with Function Approximation
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Doina Precup, Richard S. Sutton, Sanjoy Dasgupta
ML
2002
ACM
13 years 4 months ago
On Average Versus Discounted Reward Temporal-Difference Learning
We provide an analytical comparison between discounted and average reward temporal-difference (TD) learning with linearly parameterized approximations. We first consider the asympt...
John N. Tsitsiklis, Benjamin Van Roy
CORR
2010
Springer
13 years 3 months ago
Predictive State Temporal Difference Learning
We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with subspace identification. In practical applications...
Byron Boots, Geoffrey J. Gordon
IPMI
2005
Springer
13 years 10 months ago
Robust Nonrigid Multimodal Image Registration Using Local Frequency Maps
Automatic multi-modal image registration is central to numerous tasks in medical imaging today and has a vast range of applications e.g., image guidance, atlas construction, etc. ...
Bing Jian, Baba C. Vemuri, José L. Marroqu...