Planning in partially-observable dynamical systems (such as POMDPs and PSRs) is a computationally challenging task. Popular approximation techniques that have proved successful ar...
Michael R. James, Michael E. Samples, Dmitri A. Do...
We develop a point based method for solving finitely nested interactive POMDPs approximately. Analogously to point based value iteration (PBVI) in POMDPs, we maintain a set of bel...
Concurrent reachability games are a class of games heavily studied by the computer science community, in particular by the formal methods community. Two standard algorithms for app...
In this paper, we develop a stochastic approximation method to solve a monotone estimation problem and use this method to enhance the empirical performance of the Q-learning algor...
In this paper we develop a theoretical analysis of the performance of sampling-based fitted value iteration (FVI) to solve infinite state-space, discounted-reward Markovian decisi...