Point-Based Policy Iteration


We describe a point-based policy iteration (PBPI) algorithm for infinite-horizon POMDPs. PBPI replaces the exact policy improvement step of Hansen’s policy iteration with point-based value iteration (PBVI). Despite being an approximate algorithm, PBPI is monotonic: At each iteration before convergence, PBPI produces a policy for which the values increase for at least one of a finite set of initial belief states, and decrease for none of these states. In contrast, PBVI cannot guarantee monotonic improvement of the value function or the policy. In practice, PBPI generally needs a lower density of point coverage in the simplex and tends to produce superior policies with less computation. Experiments on several benchmark problems (up to 12,545 states) demonstrate the scalability and robustness of the PBPI algorithm.
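For readers unfamiliar with the point-based step the abstract refers to, the following is a minimal sketch of a standard point-based Bellman backup at a single belief point, the building block of PBVI-style solvers. It is not the authors' PBPI implementation: the function name `point_based_backup`, the nested-list model layout, and the two-state toy model (`T`, `O`, `R`) are all assumptions made for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def point_based_backup(b, alphas, T, O, R, gamma):
    """One point-based backup at belief b (illustrative sketch).

    Assumed layout (hypothetical, not from the paper):
      T[a][s][s2] = P(s2 | s, a)   transition model
      O[a][s2][o] = P(o | s2, a)   observation model
      R[a][s]     = immediate reward
      alphas      = current set of alpha vectors (one value per state)
    Returns (new_alpha, best_action).
    """
    n_states = len(b)
    best = None
    for a in range(len(T)):
        new_alpha = list(R[a])  # start from the immediate reward
        for o in range(len(O[a][0])):
            # Project every alpha vector back through (action a, observation o).
            projected = [
                [
                    sum(T[a][s][s2] * O[a][s2][o] * alpha[s2]
                        for s2 in range(n_states))
                    for s in range(n_states)
                ]
                for alpha in alphas
            ]
            # Keep only the projection that is best at this belief point.
            g = max(projected, key=lambda v: dot(b, v))
            new_alpha = [x + gamma * y for x, y in zip(new_alpha, g)]
        value = dot(b, new_alpha)
        if best is None or value > best[0]:
            best = (value, new_alpha, a)
    return best[1], best[2]


# Hypothetical toy model: 2 states, 2 actions, 2 uninformative observations.
T = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]  # state never changes
O = [[[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]]
R = [[1.0, 0.0], [0.0, 1.0]]  # action i pays off in state i
```

Because the maximization is carried out only at the supplied belief point, each backup yields a single alpha vector rather than the full set an exact backup would enumerate. Hansen's policy iteration represents the policy as a finite-state controller, so a full PBPI implementation would need additional machinery (policy evaluation and controller updates) that this sketch does not attempt.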

Shihao Ji, Ronald Parr, Hui Li, Xuejun Liao, Lawrence Carin


AAAI 2007 | Hansen’s Policy Iteration | Intelligent Agents | Point-based Policy Iteration | Policy Iteration


» Improving Anytime Point-Based Value Iteration Using Principled Point Selections

» Point-Based Value Iteration for Continuous POMDPs

» Belief Selection in Point-Based Planning Algorithms for POMDPs

» Anytime Point-Based Approximations for Large POMDPs

» Shape Analysis Using a Point-Based Statistical Shape Model Built on Correspondence Probabil...

» Prioritizing Point-Based POMDP Solvers

» Paging and Registration in Cellular Networks Jointly Optimal Policies and an Iterative Alg...

» Learning near-optimal policies with Bellman-residual minimization based fitted policy iterat...

Post Info

Added	02 Oct 2010
Updated	02 Oct 2010
Type	Conference
Year	2007
Where	AAAI
Authors	Shihao Ji, Ronald Parr, Hui Li, Xuejun Liao, Lawrence Carin
