Decision making based on the comparison of multiple criteria of two or more alternatives, is the subject of intensive research. In many decision making situations, a single criter...
Alfons Schuster, Werner Dubitzky, Philippe Lopes, ...
We model reinforcement learning as the problem of learning to control a Partially Observable Markov Decision Process ( ¢¡¤£¦¥§ ), and focus on gradient ascent approache...
In environmental and natural resource planning domains actions are taken at a large number of locations over multiple time periods. These problems have enormous state and action s...
We define stochastic timed games, which extend two-player timed games with probabilities (following a recent approach by Baier et al), and which extend in a natural way continuous-...
We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decisio...