Reinforcement learning models generally assume that a stimulus is presented that allows a learner to unambiguously identify the state of nature, and the reward received is drawn f...
Tobias Larsen, David S. Leslie, Edmund J. Collins,...
Model learning combined with dynamic programming has been shown to be e ective for learning control of continuous state dynamic systems. The simplest method assumes the learned mod...
Markov decision processes (MDPs) are an established framework for solving sequential decision-making problems under uncertainty. In this work, we propose a new method for batchmod...
— We consider the problem of optimal control in continuous and partially observable environments when the parameters of the model are not known exactly. Partially Observable Mark...
Abstract. The Factored Markov Decision Process (FMDP) framework is a standard representation for sequential decision problems under uncertainty where the state is represented as a ...
Olga Kozlova, Olivier Sigaud, Pierre-Henri Wuillem...