Sciweavers

» Policy Gradients for Cryptanalysis
UAI
2008
13 years 6 months ago
Improving Gradient Estimation by Incorporating Sensor Data
An efficient policy search algorithm should estimate the local gradient of the objective function, with respect to the policy parameters, from as few trials as possible. Whereas m...
Gregory Lawrence, Stuart J. Russell
SIAMCO
2008
13 years 5 months ago
A Knowledge-Gradient Policy for Sequential Information Collection
In a sequential Bayesian ranking and selection problem with independent normal populations and common known variance, we study a previously introduced measurement policy which we ...
Peter Frazier, Warren B. Powell, Savas Dayanik
ICML
2009
IEEE
14 years 6 months ago
Predictive representations for policy gradient in POMDPs
We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...
Abdeslam Boularias, Brahim Chaib-draa
ECML
2005
Springer
13 years 10 months ago
Natural Actor-Critic
This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing Amari...
Jan Peters, Sethu Vijayakumar, Stefan Schaal
ICML
2003
IEEE
14 years 6 months ago
Model-based Policy Gradient Reinforcement Learning
Xin Wang, Thomas G. Dietterich