Sciweavers

86
Voted
NIPS
2003
15 years 10 days ago
Policy Search by Dynamic Programming
We consider the policy search approach to reinforcement learning. We show that if a “baseline distribution” is given (indicating roughly how often we expect a good policy to v...
J. Andrew Bagnell, Sham Kakade, Andrew Y. Ng, Jeff...