We model reinforcement learning as the problem of learning to control a Partially Observable Markov Decision Process ( ¢¡¤£¦¥§ ), and focus on gradient ascent approache...
Abstract –In this paper, a new problem, consensus estimation, is formulated, whose setting is complementary to the well-known CEO problem. In particular, a set of nodes are emplo...
This paper presents an adaptation of the standard quantum search technique to enable application within Dynamic Programming, in order to optimise a Markov Decision Process. This i...
We develop a hierarchical approach to planning for partially observable Markov decision processes (POMDPs) in which a policy is represented as a hierarchical finite-state control...
We propose a frameworkfor robot programming which allows the seamless integration of explicit agent programming with decision-theoretic planning. Specifically, the DTGolog model a...
Craig Boutilier, Raymond Reiter, Mikhail Soutchans...