ALT
2007
Springer

Pseudometrics for State Aggregation in Average Reward Markov Decision Processes

We consider how state similarity in average reward Markov decision processes (MDPs) may be described by pseudometrics. Introducing the notion of adequate pseudometrics, which are well adapted to the structure of the MDP, we show how these may be used for state aggregation. Upper bounds on the loss that may be caused by working on the aggregated MDP instead of the original MDP are given and compared to the bounds that have been achieved for discounted reward MDPs.
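
To illustrate the general idea of aggregating states with a pseudometric, here is a minimal Python sketch. It is not the construction from the paper: the function `aggregate_states`, the reward-difference distance `reward_dist`, the threshold `eps`, and the toy rewards are all hypothetical, and the abstract does not say how adequate pseudometrics are defined. The sketch only shows one simple way to group states that lie within `eps` of a block representative.

```python
from typing import Callable, List, Sequence

def aggregate_states(
    states: Sequence[int],
    dist: Callable[[int, int], float],
    eps: float,
) -> List[List[int]]:
    """Greedily group states: each state joins the first block whose
    representative (the block's first state) is within eps of it."""
    blocks: List[List[int]] = []
    for s in states:
        for block in blocks:
            if dist(block[0], s) <= eps:
                block.append(s)
                break
        else:
            # No existing representative is close enough: open a new block.
            blocks.append([s])
    return blocks

# Hypothetical toy data: one mean reward per state.
rewards = [0.0, 0.05, 0.9, 1.0]

def reward_dist(s: int, t: int) -> float:
    # Absolute reward difference. This is a pseudometric (symmetric,
    # satisfies the triangle inequality, and may be 0 for distinct states),
    # chosen here only for illustration.
    return abs(rewards[s] - rewards[t])

blocks = aggregate_states(range(len(rewards)), reward_dist, eps=0.15)
print(blocks)
```

A distance that ignores transition probabilities, as this one does, would generally not be well adapted to the structure of an MDP; a realistic choice would also compare the transition behavior of the two states.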
Ronald Ortner
Added 14 Mar 2010
Updated 14 Mar 2010
Type Conference
Year 2007
Where ALT
Authors Ronald Ortner