policy iteration | Sciweavers


NN
2010
Springer


Efficient exploration through active learning for value function approximation in reinforcement learning


Appropriately designing sampling policies is highly important for obtaining better control policies in reinforcement learning. In this paper, we first show that the least-squares ...

Takayuki Akiyama, Hirotaka Hachiya, Masashi Sugiya...


CDC
2010
IEEE


Q-learning and enhanced policy iteration in discounted dynamic programming


We consider the classical finite-state discounted Markovian decision problem, and we introduce a new policy iteration-like algorithm for finding the optimal state costs or Q-facto...

Dimitri P. Bertsekas, Huizhen Yu
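
For orientation, the classical policy iteration scheme that the abstract above takes as its starting point can be sketched as follows. This is the textbook method (evaluate the current policy, then improve it greedily), not the new policy iteration-like algorithm the paper introduces. The two-state problem, its costs, the discount factor, and all function names are made up for illustration.

```python
# Textbook policy iteration on a tiny discounted MDP (cost minimization).
# All numbers and names here are illustrative assumptions, not taken from the paper.

GAMMA = 0.9  # discount factor (assumed)

# P[s][a] lists (next_state, probability) pairs; G[s][a] is the expected stage cost.
P = {
    0: {0: [(0, 1.0)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)], 1: [(1, 1.0)]},
}
G = {
    0: {0: 2.0, 1: 1.0},
    1: {0: 0.5, 1: 0.0},
}


def q_value(s, a, J):
    """Q-factor of (s, a): stage cost plus discounted expected cost-to-go under J."""
    return G[s][a] + GAMMA * sum(p * J[t] for t, p in P[s][a])


def evaluate(policy, tol=1e-10):
    """Policy evaluation by repeated in-place sweeps until the largest change is below tol."""
    J = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            new = q_value(s, policy[s], J)
            delta = max(delta, abs(new - J[s]))
            J[s] = new
        if delta < tol:
            return J


def policy_iteration():
    """Alternate evaluation and greedy improvement until the policy stops changing."""
    policy = {s: 0 for s in P}
    while True:
        J = evaluate(policy)
        improved = {s: min(P[s], key=lambda a: q_value(s, a, J)) for s in P}
        if improved == policy:
            return policy, J
        policy = improved


if __name__ == "__main__":
    policy, J = policy_iteration()
    print(policy, J)
```

In this toy problem the loop settles on action 1 in both states after a single improvement step. Evaluation is done iteratively here to keep the sketch free of dependencies; solving the linear system for the policy's costs exactly is the other standard choice.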


CDC
2010
IEEE


Pathologies of temporal difference methods in approximate dynamic programming


Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated conve...

Dimitri P. Bertsekas


AUTOMATICA
2008


Policy iteration based feedback control


It is well known that stochastic control systems can be viewed as Markov decision processes (MDPs) with continuous state spaces. In this paper, we propose to apply the policy iter...

Kan-Jian Zhang, Yan-Kai Xu, Xi Chen, Xi-Ren Cao


NIPS
2003


Bounded Finite State Controllers


We describe a new approximation algorithm for solving partially observable MDPs. Our bounded policy iteration approach searches through the space of bounded-size, stochastic fini...

Pascal Poupart, Craig Boutilier


NIPS
2001


Model-Free Least-Squares Policy Iteration


We propose a new approach to reinforcement learning which combines least squares function approximation with policy iteration. Our method is model-free and completely off policy. ...

Michail G. Lagoudakis, Ronald Parr


AAAI
2006


Incremental Least Squares Policy Iteration for POMDPs


We present a new algorithm, called incremental least squares policy iteration (ILSPI), for finding the infinite-horizon stationary policy for partially observable Markov decision ...

Hui Li, Xuejun Liao, Lawrence Carin


UAI
2008


Sparse Stochastic Finite-State Controllers for POMDPs


Bounded policy iteration is an approach to solving infinite-horizon POMDPs that represents policies as stochastic finite-state controllers and iteratively improves a controller by a...

Eric A. Hansen


AAAI
2007


Point-Based Policy Iteration


We describe a point-based policy iteration (PBPI) algorithm for infinite-horizon POMDPs. PBPI replaces the exact policy improvement step of Hansen’s policy iteration with point...

Shihao Ji, Ronald Parr, Hui Li, Xuejun Liao, Lawre...


VALUETOOLS
2006
ACM


How to solve large scale deterministic games with mean payoff by policy iteration


Min-max functions are dynamic programming operators of zero-sum deterministic games with finite state and action spaces. The problem of computing the linear growth rate of the or...

Vishesh Dhingra, Stephane Gaubert
