Sciweavers

204

CDC
2009
IEEE

132views Control Systems» more CDC 2009»

Q-learning and Pontryagin's Minimum Principle

16 years 9 days ago

Abstract— Q-learning is a technique used to compute an optimal policy for a controlled Markov chain based on observations of the system controlled using a non-optimal policy. It ...

Prashant G. Mehta, Sean P. Meyn

claim paper

Read More »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers