We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Simi...
We consider decentralized control of Markov decision processes and give complexity bounds on the worst-case running time for algorithms that find optimal solutions. Generalization...
Daniel S. Bernstein, Shlomo Zilberstein, Neil Imme...
An issue that is critical for the application of Markov decision processes MDPs to realistic problems is how the complexity of planning scales with the size of the MDP. In stochas...
We consider a networked control system, where each subsystem evolves as a Markov decision process (MDP). Each subsystem is coupled to its neighbors via communication links over wh...
We introduce the generalized semi-Markov decision process (GSMDP) as an extension of continuous-time MDPs and semi-Markov decision processes (SMDPs) for modeling stochastic decisi...