Sciweavers

147 search results - page 1 / 30
» Policy Gradient in Continuous Time
JMLR
2006
Policy Gradient in Continuous Time
Policy search is a method for approximately solving an optimal control problem by performing a parametric optimization search in a given class of parameterized policies. In order ...
Rémi Munos
ICANNGA
2007
Springer
Reinforcement Learning in Fine Time Discretization
Reinforcement Learning (RL) is analyzed here as a tool for control system optimization. State and action spaces are assumed to be continuous. Time is assumed to be discrete, yet th...
Pawel Wawrzynski
ICML
2008
IEEE
Non-parametric policy gradients: a unified treatment of propositional and relational domains
Policy gradient approaches are a powerful instrument for learning how to interact with the environment. Existing approaches have focused on propositional and continuous domains on...
Kristian Kersting, Kurt Driessens
CORR
2006
Springer
A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD
This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(λ), LSTD(λ)...
Manuel Loth, Philippe Preux
SIAMCO
2008
A Knowledge-Gradient Policy for Sequential Information Collection
In a sequential Bayesian ranking and selection problem with independent normal populations and common known variance, we study a previously introduced measurement policy which we ...
Peter Frazier, Warren B. Powell, Savas Dayanik