Sciweavers

40 search results - page 1 / 8
» Parametric regret in uncertain Markov decision processes
CDC
2009
IEEE
13 years 9 months ago
Parametric regret in uncertain Markov decision processes
We consider decision making in a Markovian setup where the reward parameters are not known in advance. Our performance criterion is the gap between the performance of the best ...
Huan Xu, Shie Mannor
AAAI
2010
13 years 6 months ago
Robust Policy Computation in Reward-Uncertain MDPs Using Nondominated Policies
The precise specification of reward functions for Markov decision processes (MDPs) is often extremely difficult, motivating research into both reward elicitation and the robust so...
Kevin Regan, Craig Boutilier
ALT
2008
Springer
14 years 1 month ago
Online Regret Bounds for Markov Decision Processes with Deterministic Transitions
We consider an upper confidence bound algorithm for Markov decision processes (MDPs) with deterministic transitions. For this algorithm we derive upper bounds on the onl...
Ronald Ortner
EWRL
2008
13 years 6 months ago
Markov Decision Processes with Arbitrary Reward Processes
We consider a control problem where the decision maker interacts with a standard Markov decision process with the exception that the reward functions vary arbitrarily ove...
Jia Yuan Yu, Shie Mannor, Nahum Shimkin
ICML
2007
IEEE
14 years 5 months ago
Percentile optimization in uncertain Markov decision processes with application to efficient exploration
Markov decision processes are an effective tool in modeling decision-making in uncertain dynamic environments. Since the parameters of these models are typically estimated from da...
Erick Delage, Shie Mannor