Search Sciweavers | Sciweavers

672 search results - page 1 / 135

» Policy Search by Dynamic Programming

NIPS
2003

108 views, Information Technology

Policy Search by Dynamic Programming

15 years 7 months ago

Download books.nips.cc

We consider the policy search approach to reinforcement learning. We show that if a “baseline distribution” is given (indicating roughly how often we expect a good policy to v...

J. Andrew Bagnell, Sham Kakade, Andrew Y. Ng, Jeff...

FLAIRS
2007

125 views, Artificial Intelligence

Performance Analysis of Evolutionary Search with a Dynamic Restart Policy

15 years 8 months ago

Download www.aaai.org

In this work we explore how the complexity of a problem domain affects the performance of evolutionary search using a performance-based restart policy. Previous research indicates...

Michael Solano, Istvan Jonyer

AAAI
2006

121 views, Intelligent Agents

Focused Real-Time Dynamic Programming for MDPs: Squeezing More Out of a Heuristic

15 years 7 months ago

Download www.cs.cmu.edu

Real-time dynamic programming (RTDP) is a heuristic search algorithm for solving MDPs. We present a modified algorithm called Focused RTDP with several improvements. While RTDP ma...

Trey Smith, Reid G. Simmons

ICML
2008
IEEE

133 views, Machine Learning

Space-indexed dynamic programming: learning to follow trajectories

16 years 6 months ago

Download www.cs.stanford.edu

We consider the task of learning to accurately follow a trajectory in a vehicle such as a car or helicopter. A number of dynamic programming algorithms such as Differential Dynami...

J. Zico Kolter, Adam Coates, Andrew Y. Ng, Yi Gu, ...

claim paper

Read More »

170

click to vote

CDC
2010
IEEE

136 views, Control Systems

Pathologies of temporal difference methods in approximate dynamic programming

15 years 1 month ago

Download web.mit.edu

Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated conve...

Dimitri P. Bertsekas
