
CDC
2009
IEEE

Arbitrarily modulated Markov decision processes

We consider decision-making problems in Markov decision processes where both the rewards and the transition probabilities vary in an arbitrary (e.g., nonstationary) fashion. We propose an online Q-learning style algorithm and give a guarantee on its performance evaluated in retrospect against alternative policies. Unlike previous works, the guarantee depends critically on the variability of the uncertainty in the transition probabilities, but holds regardless of arbitrary changes in rewards and transition probabilities over time. Besides its intrinsic computational efficiency, this approach requires neither prior knowledge nor estimation of the transition probabilities.
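The abstract does not spell out the algorithm, so the following is only a minimal sketch of the textbook tabular Q-learning update, not the authors' method. It illustrates what "Q-learning style" and "neither prior knowledge nor estimation of the transition probabilities" can look like in code: the learner updates from observed (state, action, reward, next state) tuples only. The toy environment, its phase-switching rewards and transitions, and all names and parameter values (`q_learning_step`, `run`, `alpha`, `gamma`, the 0.1 exploration rate) are assumptions made for illustration.

```python
import random


def q_learning_step(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update and return the new value.

    This is the textbook rule, used here as a stand-in; it is not taken
    from the paper.
    """
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def run(num_steps=1000, num_states=2, num_actions=2, seed=0):
    """Run Q-learning on a hypothetical environment that changes over time."""
    rng = random.Random(seed)
    q = [[0.0] * num_actions for _ in range(num_states)]
    state = 0
    for t in range(num_steps):
        # Epsilon-greedy action selection (exploration rate is an assumption).
        if rng.random() < 0.1:
            action = rng.randrange(num_actions)
        else:
            action = max(range(num_actions), key=lambda a: q[state][a])

        # Hypothetical nonstationary environment: every 100 steps both the
        # rewarded action and the transition probabilities switch.
        phase = (t // 100) % 2
        reward = 1.0 if action == phase else 0.0
        stay_prob = 0.8 if phase == 0 else 0.3
        if rng.random() < stay_prob:
            next_state = state
        else:
            next_state = (state + 1) % num_states

        # The learner sees only the sampled transition, never stay_prob.
        q_learning_step(q, state, action, reward, next_state)
        state = next_state
    return q


if __name__ == "__main__":
    for row in run():
        print(row)
```

The update never reads `stay_prob`, so no transition model is known or estimated. A plain update like this carries no performance guarantee under arbitrary changes; the guarantee described in the abstract belongs to the authors' algorithm.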
Jia Yuan Yu, Shie Mannor
Added 21 Jul 2010
Updated 21 Jul 2010
Type Conference
Year 2009
Where CDC
Authors Jia Yuan Yu, Shie Mannor