Sciweavers

NIPS
2004

VDCBPI: an Approximate Scalable Algorithm for Large POMDPs

13 years 5 months ago
VDCBPI: an Approximate Scalable Algorithm for Large POMDPs
Existing algorithms for discrete partially observable Markov decision processes can at best solve problems of a few thousand states due to two important sources of intractability: the curse of dimensionality and the policy space complexity. This paper describes a new algorithm (VDCBPI) that mitigates both sources of intractability by combining the Value Directed Compression (VDC) technique [13] with Bounded Policy Iteration (BPI) [14]. The scalability of VDCBPI is demonstrated on synthetic network management problems with up to 33 million states.
Pascal Poupart, Craig Boutilier
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2004
Where NIPS
Authors Pascal Poupart, Craig Boutilier
Comments (0)