Search Sciweavers | Sciweavers

200 search results - page 5 / 40

» Point-Based Policy Iteration

ESOP
2007
Springer

152 views, Programming Languages

Static Analysis by Policy Iteration on Relational Domains

16 years 25 days ago

Download minimal.inria.fr

We give a new practical algorithm to compute, in finite time, a fixpoint (and often the least fixpoint) of a system of equations in the abstract numerical domains of zones and t...

Stephane Gaubert, Eric Goubault, Ankur Taly, Sarah...

JMLR
2002

100 views

On the Convergence of Optimistic Policy Iteration

15 years 6 months ago

Download www.mit.edu

We consider a finite-state Markov decision problem and establish the convergence of a special case of optimistic policy iteration that involves Monte Carlo estimation of Q-values,...

John N. Tsitsiklis

ICRA
2009
IEEE

143 views, Robotics

Least absolute policy iteration for robust value function approximation

16 years 1 month ago

Download sugiyama-www.cs.titech.ac.jp

Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers...

Masashi Sugiyama, Hirotaka Hachiya, Hisashi Kashim...

NIPS
2008

165 views, Information Technology

Regularized Policy Iteration

15 years 8 months ago

Download webdocs.cs.ualberta.ca

In this paper we consider approximate policy-iteration-based reinforcement learning algorithms. In order to implement a flexible function approximation scheme we propose the use o...

Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...

NIPS
2001

206 views, Information Technology

Model-Free Least-Squares Policy Iteration

15 years 8 months ago

Download www.cs.duke.edu

We propose a new approach to reinforcement learning which combines least squares function approximation with policy iteration. Our method is model-free and completely off policy. ...

Michail G. Lagoudakis, Ronald Parr
