
AAAI
2006

Compact, Convex Upper Bound Iteration for Approximate POMDP Planning

Partially observable Markov decision processes (POMDPs) are an intuitive and general way to model sequential decision-making problems under uncertainty. Unfortunately, even approximate planning in POMDPs is known to be hard, and developing heuristic planners that deliver reasonable results in practice has proved to be a significant challenge. In this paper, we present a new approach to approximate value iteration for POMDP planning that is based on quadratic rather than piecewise linear function approximators. Specifically, we approximate the optimal value function by a convex upper bound composed of a fixed number of quadratics, and optimize it at each stage by semidefinite programming. We demonstrate that our approach can achieve approximation quality competitive with current techniques while still maintaining a bounded-size representation of the function approximator. Moreover, an upper bound on the optimal value function can be preserved if required. Overall, the technique requi...
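As a rough illustration of the representation the abstract describes, the sketch below evaluates a convex function of a belief vector built from a fixed number of quadratics. It assumes, purely for illustration, that the quadratics are combined by a pointwise maximum and that each has the form b^T W b + w^T b + c with a positive semidefinite W; the abstract does not state how the quadratics are combined. The names `quadratic`, `value_bound`, and `QUADRATICS`, and the toy numbers, are hypothetical and not taken from the paper. The semidefinite programming step that would fit the quadratics is not shown.

```python
# Hypothetical sketch: a convex function of a belief b, represented as the
# pointwise maximum of a fixed number of convex quadratics
#     q_i(b) = b^T W_i b + w_i^T b + c_i.
# The form, the names, and the numbers are illustrative assumptions only.

def quadratic(W, w, c, b):
    """Evaluate b^T W b + w^T b + c for a belief vector b given as a list."""
    n = len(b)
    quad = sum(b[i] * W[i][j] * b[j] for i in range(n) for j in range(n))
    lin = sum(w[i] * b[i] for i in range(n))
    return quad + lin + c

def value_bound(quadratics, b):
    """Pointwise maximum over the quadratics; convex when every W_i is PSD."""
    return max(quadratic(W, w, c, b) for (W, w, c) in quadratics)

# Two toy quadratics over a 2-state belief. Both W matrices are diagonal
# with nonnegative entries, hence positive semidefinite.
QUADRATICS = [
    ([[1.0, 0.0], [0.0, 0.0]], [0.0, 0.0], 0.0),  # q_1(b) = b[0]^2
    ([[0.0, 0.0], [0.0, 1.0]], [0.0, 0.0], 0.0),  # q_2(b) = b[1]^2
]

if __name__ == "__main__":
    for belief in ([1.0, 0.0], [0.5, 0.5], [0.0, 1.0]):
        print(belief, value_bound(QUADRATICS, belief))
```

Because the number of quadratics is fixed, the size of this representation does not grow from one iteration to the next, which is the property the abstract contrasts with piecewise linear approximators.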
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2006
Where AAAI
Authors Tao Wang, Pascal Poupart, Michael H. Bowling, Dale Schuurmans