We consider a networked control system, where each subsystem evolves as a Markov decision process (MDP). Each subsystem is coupled to its neighbors via communication links over wh...
For a Markov Decision Process with finite state (size S) and action spaces (size A per state), we propose a new algorithm--Delayed Q-Learning. We prove it is PAC, achieving near o...
Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...
We consider decentralized control of Markov decision processes and give complexity bounds on the worst-case running time for algorithms that find optimal solutions. Generalization...
Daniel S. Bernstein, Shlomo Zilberstein, Neil Imme...
This paper presents a novel framework for simultaneously learning representation and control in continuous Markov decision processes. Our approach builds on the framework of proto...