We introduce the Oracular Partially Observable Markov Decision Process (OPOMDP), a type of POMDP in which the world produces no observations; instead there is an “oracle,” ...
Stochastic games generalize Markov decision processes (MDPs) to a multiagent setting by allowing the state transitions to depend jointly on all player actions, and having rewards de...
Michael J. Kearns, Yishay Mansour, Satinder P. Sin...
We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov chains. The method is based on designing sequential control variates using s...
We propose a method of approximate dynamic programming for Markov decision processes (MDPs) using algebraic decision diagrams (ADDs). We produce near-optimal value functions and p...
We propose a dynamic spectrum access scheme where secondary users recommend “good” channels to each other and access accordingly. We formulate the problem as an average rewa...