Research in reinforcementlearning (RL)has thus far concentrated on two optimality criteria: the discounted framework, which has been very well-studied, and the averagereward frame...
Learning the reward function of an agent by observing its behavior is termed inverse reinforcement learning and has applications in learning from demonstration or apprenticeship l...
Reinforcement learning in real-world domains suffers from three curses of dimensionality: explosions in state and action spaces, and high stochasticity. We present approaches that ...
We present an efficient "sparse sampling" technique for approximating Bayes optimal decision making in reinforcement learning, addressing the well known exploration vers...
Tao Wang, Daniel J. Lizotte, Michael H. Bowling, D...
The reward functions that drive reinforcement learning systems are generally derived directly from the descriptions of the problems that the systems are being used to solve. In so...