Sciweavers

NIPS
2003
13 years 6 months ago
Policy Search by Dynamic Programming
We consider the policy search approach to reinforcement learning. We show that if a “baseline distribution” is given (indicating roughly how often we expect a good policy to v...
J. Andrew Bagnell, Sham Kakade, Andrew Y. Ng, Jeff...
FLAIRS
2007
13 years 7 months ago
Performance Analysis of Evolutionary Search with a Dynamic Restart Policy
In this work we explore how the complexity of a problem domain affects the performance of evolutionary search using a performance-based restart policy. Previous research indicates...
Michael Solano, Istvan Jonyer
AAAI
2006
13 years 6 months ago
Focused Real-Time Dynamic Programming for MDPs: Squeezing More Out of a Heuristic
Real-time dynamic programming (RTDP) is a heuristic search algorithm for solving MDPs. We present a modified algorithm called Focused RTDP with several improvements. While RTDP ma...
Trey Smith, Reid G. Simmons
ICML
2008
IEEE
14 years 6 months ago
Space-indexed dynamic programming: learning to follow trajectories
We consider the task of learning to accurately follow a trajectory in a vehicle such as a car or helicopter. A number of dynamic programming algorithms such as Differential Dynami...
J. Zico Kolter, Adam Coates, Andrew Y. Ng, Yi Gu, ...
CDC
2010
IEEE
13 years 8 days ago
Pathologies of temporal difference methods in approximate dynamic programming
Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated conve...
Dimitri P. Bertsekas