Sciweavers

95 search results - page 3 / 19
» Policy Gradients for Cryptanalysis
Sort
View
NIPS
2008
13 years 7 months ago
Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms
Policy gradient (PG) reinforcement learning algorithms have strong (local) convergence guarantees, but their learning performance is typically limited by a large variance in the e...
John W. Roberts, Russ Tedrake
IJCAI
2003
13 years 7 months ago
Covariant Policy Search
We investigate the problem of non-covariant behavior of policy gradient reinforcement learning algorithms. The policy gradient approach is amenable to analysis by information geom...
J. Andrew Bagnell, Jeff G. Schneider
AIPS
2007
13 years 8 months ago
Concurrent Probabilistic Temporal Planning with Policy-Gradients
We present an any-time concurrent probabilistic temporal planner that includes continuous and discrete uncertainties and metric functions. Our approach is a direct policy search t...
Douglas Aberdeen, Olivier Buffet
AAAI
2011
12 years 6 months ago
Policy Gradient Planning for Environmental Decision Making with Existing Simulators
In environmental and natural resource planning domains actions are taken at a large number of locations over multiple time periods. These problems have enormous state and action s...
Mark Crowley, David Poole
AAAI
2010
13 years 7 months ago
Relative Entropy Policy Search
Policy search is a successful approach to reinforcement learning. However, policy improvements often result in the loss of information. Hence, it has been marred by premature conv...
Jan Peters, Katharina Mülling, Yasemin Altun