Sciweavers

33 search results - page 1 / 7
CDC 2010, IEEE
Pathologies of temporal difference methods in approximate dynamic programming
Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated conve...
Dimitri P. Bertsekas
ICML 2006, IEEE
Automatic basis function construction for approximate dynamic programming and reinforcement learning
We address the problem of automatically constructing basis functions for linear approximation of the value function of a Markov Decision Process (MDP). Our work builds on results ...
Philipp W. Keller, Shie Mannor, Doina Precup
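The setting this abstract builds on can be sketched briefly. The following is an illustrative toy example only (not the paper's construction method): it represents the value function of a small MDP as a linear combination of hand-chosen basis functions, V(s) ≈ w·φ(s), and fits the weights by least squares to a hypothetical set of target values. The state values and the polynomial basis are assumptions made up for the illustration.

```python
import numpy as np

# Toy sketch (assumed setup, not the paper's method): linear value-function
# approximation V(s) ~ w . phi(s) on a 5-state MDP.
n_states = 5
states = np.arange(n_states)

# Hypothetical target values to fit (e.g. from exact dynamic programming).
v_true = np.array([0.0, 1.0, 2.5, 4.5, 7.0])

# Hand-chosen basis functions phi(s) = [1, s, s^2].
phi = np.column_stack([np.ones(n_states), states, states**2])

# Least-squares weights: w = argmin_w ||phi w - v_true||^2
w, *_ = np.linalg.lstsq(phi, v_true, rcond=None)
v_hat = phi @ w

print("max abs error:", np.abs(v_hat - v_true).max())
```

The quality of such an approximation hinges entirely on how well the basis spans the true value function, which is what motivates constructing the basis automatically rather than by hand.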
ICML 1995, IEEE
Stable Function Approximation in Dynamic Programming
The success of reinforcement learning in practical problems depends on the ability to combine function approximation with temporal difference methods such as value iteration. Experime...
Geoffrey J. Gordon
3DPVT 2006, IEEE
A Spatio-Temporal Modeling Method for Shape Representation
The spherical harmonic (SPHARM) description is a powerful surface modeling technique that can model arbitrarily shaped but simply connected three dimensional (3D) objects. Because...
Heng Huang, Li Shen, Rong Zhang, Fillia Makedon, J...
ATAL 2005, Springer
Improving reinforcement learning function approximators via neuroevolution
Reinforcement learning problems are commonly tackled with temporal difference methods, which use dynamic programming and statistical sampling to estimate the long-term value of ta...
Shimon Whiteson
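The "dynamic programming and statistical sampling" combination described in this abstract can be sketched with a minimal example. The following is a standard tabular TD(0) estimator on a 5-state random walk, an assumed toy environment chosen for the illustration (it is not Whiteson's neuroevolution method): each sampled transition nudges the current value estimate toward the bootstrapped target r + γV(s').

```python
import random

# Minimal TD(0) sketch on a 5-state random walk (assumed toy environment).
# Non-terminal states 0..4; exiting on the right yields reward 1, on the
# left reward 0. Each transition applies the TD(0) update
#   V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))
random.seed(0)
n = 5
V = [0.0] * n
alpha, gamma = 0.1, 1.0

for _ in range(5000):
    s = n // 2                     # every episode starts in the middle
    while True:
        s2 = s + random.choice([-1, 1])
        if s2 < 0:                 # left terminal: reward 0, V(terminal)=0
            V[s] += alpha * (0.0 - V[s])
            break
        if s2 >= n:                # right terminal: reward 1
            V[s] += alpha * (1.0 - V[s])
            break
        V[s] += alpha * (gamma * V[s2] - V[s])
        s = s2

# For this walk the exact values are (s + 1) / 6 for s = 0..4.
print([round(v, 2) for v in V])
```

With a constant step size the estimates fluctuate around the exact values (s + 1)/6 rather than converging to them; replacing the table `V` with a parameterized function approximator is exactly the step whose stability the entries above are concerned with.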