Sciweavers

132 search results - page 23 / 27
» Generalization in Reinforcement Learning: Safely Approximati...
Sort
View
ICML
2000
IEEE
16 years 12 days ago
Eligibility Traces for Off-Policy Policy Evaluation
Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference meth...
Doina Precup, Richard S. Sutton, Satinder P. Singh
NIPS
1993
15 years 27 days ago
Using Local Trajectory Optimizers to Speed Up Global Optimization in Dynamic Programming
Dynamic programming provides a methodology to develop planners and controllers for nonlinear systems. However, general dynamic programming is computationally intractable. We have ...
Christopher G. Atkeson
JAIR
1998
198views more  JAIR 1998»
14 years 11 months ago
Probabilistic Inference from Arbitrary Uncertainty using Mixtures of Factorized Generalized Gaussians
This paper presents a general and efficient framework for probabilistic inference and learning from arbitrary uncertain information. It exploits the calculation properties of fini...
Alberto Ruiz, Pedro E. López-de-Teruel, M. ...
ECAI
2006
Springer
15 years 3 months ago
Least Squares SVM for Least Squares TD Learning
Abstract. We formulate the problem of least squares temporal difference learning (LSTD) in the framework of least squares SVM (LS-SVM). To cope with the large amount (and possible ...
Tobias Jung, Daniel Polani
ICMLA
2004
15 years 1 months ago
Planning with predictive state representations
Predictive state representation (PSR) models for controlled dynamical systems have recently been proposed as an alternative to traditional models such as partially observable Mark...
Michael R. James, Satinder P. Singh, Michael L. Li...