Complexity of Probabilistic Planning under Average Rewards

9 years 1 months ago
Complexity of Probabilistic Planning under Average Rewards
A general and expressive model of sequential decision making under uncertainty is provided by the Markov decision processes (MDPs) framework. Complex applications with very large state spaces are best modelled implicitly (instead of explicitly by enumerating the state space), for example as precondition-effect operators, the representation used in AI planning. This kind of representations are very powerful, and they make the construction of policies/plans computationally very complex. In many applications, average rewards over unit time is the relevant rationality criterion, as opposed to the more widely used discounted reward criterion, and for providing a solid basis for the development of efficient planning algorithms, the computational complexity of the decision problems related to average rewards has to be analyzed. We investigate the complexity of the policy/plan existence problem for MDPs under the average reward criterion, with MDPs represented in terms of conditional probabil...
Jussi Rintanen
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2001
Authors Jussi Rintanen
Comments (0)