Sciweavers

89
Voted
NECO
2010
97views more  NECO 2010»
14 years 9 months ago
Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning
Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...
Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto...
IJCAI
2003
15 years 6 days ago
Covariant Policy Search
We investigate the problem of non-covariant behavior of policy gradient reinforcement learning algorithms. The policy gradient approach is amenable to analysis by information geom...
J. Andrew Bagnell, Jeff G. Schneider
NIPS
2001
15 years 6 days ago
Rates of Convergence of Performance Gradient Estimates Using Function Approximation and Bias in Reinforcement Learning
We address two open theoretical questions in Policy Gradient Reinforcement Learning. The first concerns the efficacy of using function approximation to represent the state action ...
Gregory Z. Grudic, Lyle H. Ungar
76
Voted
ECAI
2008
Springer
15 years 18 days ago
Exploiting locality of interactions using a policy-gradient approach in multiagent learning
In this paper, we propose a policy gradient reinforcement learning algorithm to address transition-independent Dec-POMDPs. This approach aims at implicitly exploiting the locality...
Francisco S. Melo
ICML
2003
IEEE
15 years 11 months ago
Model-based Policy Gradient Reinforcement Learning
Xin Wang, Thomas G. Dietterich