Recent decision-theoric planning algorithms are able to find optimal solutions in large problems, using Factored Markov Decision Processes (fmdps). However, these algorithms need ...
Thomas Degris, Olivier Sigaud, Pierre-Henri Wuille...
For a Markov Decision Process with finite state (size S) and action spaces (size A per state), we propose a new algorithm--Delayed Q-Learning. We prove it is PAC, achieving near o...
Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...
For a given problem, the optimal Markov policy over a finite horizon is a conditional plan containing a potentially large number of branches. However, there are applications wher...
— We introduce the Oracular Partially Observable Markov Decision Process (OPOMDP), a type of POMDP in which the world produces no observations; instead there is an “oracle,” ...
Many stochastic planning problems can be represented using Markov Decision Processes (MDPs). A difficulty with using these MDP representations is that the common algorithms for so...