Recent research has shown the benefit of framing problems of imitation learning as solutions to Markov Decision Problems. This approach reduces learning to the problem of recoveri...
Brian Ziebart, Andrew L. Maas, J. Andrew Bagnell, ...
Learning to act in a multiagent environment is a difficult problem since the normal definition of an optimal policy no longer applies. The optimal policy at any moment depends on ...
Abstract. We present an implementation of model-based online reinforcement learning (RL) for continuous domains with deterministic transitions that is specifically designed to achi...
There has been a lot of recent work on Bayesian methods for reinforcement learning exhibiting near-optimal online performance. The main obstacle facing such methods is that in most...
Many algorithms such as Q-learning successfully address reinforcement learning in single-agent multi-time-step problems. In addition there are methods that address reinforcement l...