Sciweavers

COLT
2000
Springer

87views Machine Learning» more COLT 2000»

Estimation and Approximation Bounds for Gradient-Based Reinforcement Learning

13 years 9 months ago

We model reinforcement learning as the problem of learning to control a Partially Observable Markov Decision Process ( ¢¡¤£¦¥§ ), and focus on gradient ascent approache...

Peter L. Bartlett, Jonathan Baxter

claim paper

Read More »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers