We propose a new approach to the problem of searching a space of policies for a Markov decision process (MDP) or a partially observable Markov decision process (POMDP), given a mo...
Markov Decision Processes (MDPs) have been widely used as a framework for planning under uncertainty. They make it possible to compute optimal sequences of actions in order to achieve a given...
Agents often have to construct plans that obey deadlines or, more generally, resource limits for real-valued resources whose consumption can only be characterized by probability d...
We address the problem of computing an optimal value function for Markov decision processes. Since finding this function quickly and accurately requires substantial computation ef...
We consider decision making in a Markovian setup where the reward parameters are not known in advance. Our performance criterion is the gap between the performance of the best ...