Sciweavers

147 search results - page 3 / 30
» Policy Gradient in Continuous Time
AAAI
2011
12 years 4 months ago
Policy Gradient Planning for Environmental Decision Making with Existing Simulators
In environmental and natural resource planning domains, actions are taken at a large number of locations over multiple time periods. These problems have enormous state and action s...
Mark Crowley, David Poole
ICANN
2007
Springer
13 years 11 months ago
Solving Deep Memory POMDPs with Recurrent Policy Gradients
This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov...
Daan Wierstra, Alexander Förster, Jan Peters,...
EWRL
2008
13 years 6 months ago
Policy Learning - A Unified Perspective with Applications in Robotics
Policy Learning approaches are among the best suited methods for high-dimensional, continuous control systems such as anthropomorphic robot arms and humanoid robots. In this paper,...
Jan Peters, Jens Kober, Duy Nguyen-Tuong
EOR
2011
12 years 11 months ago
Continuous time mean variance asset allocation: A time-consistent strategy
We develop a numerical scheme for determining the optimal asset allocation strategy for time-consistent, continuous time, mean variance optimization. Any type of constraint can be...
J. Wang, P. A. Forsyth
DEDS
2010
13 years 4 months ago
On Regression-Based Stopping Times
We study approaches that fit a linear combination of basis functions to the continuation value function of an optimal stopping problem and then employ a greedy policy based on the...
Benjamin Van Roy