single sample path | Sciweavers

14

ML
2008
ACM

152views Machine Learning» more ML 2008»

Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path

13 years 4 months ago

Download hal.inria.fr

Abstract. We consider batch reinforcement learning problems in continuous space, expected total discounted-reward Markovian Decision Problems. As opposed to previous theoretical wo...

András Antos, Csaba Szepesvári, R&ea...

claim paper

Read More »

8

click to vote

COLT
2000
Springer

87views Machine Learning» more COLT 2000»

Estimation and Approximation Bounds for Gradient-Based Reinforcement Learning

13 years 8 months ago

Download www.cs.iastate.edu

We model reinforcement learning as the problem of learning to control a Partially Observable Markov Decision Process ( ¢¡¤£¦¥§ ), and focus on gradient ascent approache...

Peter L. Bartlett, Jonathan Baxter

claim paper

Read More »

11

click to vote

CDC
2009
IEEE

147views Control Systems» more CDC 2009»

A simulation-based method for aggregating Markov chains

13 years 9 months ago

Download mechse.illinois.edu

— This paper addresses model reduction for a Markov chain on a large state space. A simulation-based framework is introduced to perform state aggregation of the Markov chain base...

Kun Deng, Prashant G. Mehta, Sean P. Meyn

claim paper

Read More »

7

click to vote

ICML
2000
IEEE

126views Machine Learning» more ICML 2000»

Reinforcement Learning in POMDP's via Direct Gradient Ascent

14 years 5 months ago

Download reference.kfupm.edu.sa

This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled ??? ?s. We introduce ??? ?, a...

Jonathan Baxter, Peter L. Bartlett

claim paper

Read More »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers