We consider learning in a Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we wa...
Abstract— The goal of developing algorithms for programming robots by demonstration is to create an easy way of programming robots that can be accomplished by everyone. When a de...
We generalise the problem of inverse reinforcement learning to multiple tasks, from multiple demonstrations. Each one may represent one expert trying to solve a different task, or ...
Recent research has shown the benefit of framing problems of imitation learning as solutions to Markov Decision Problems. This approach reduces learning to the problem of recoveri...
Brian Ziebart, Andrew L. Maas, J. Andrew Bagnell, ...
Abstract. The application of reinforcement learning algorithms to multiagent domains may cause complex non-convergent dynamics. The replicator dynamics, commonly used in evolutiona...
Alessandro Lazaric, Jose Enrique Munoz de Cote, Fa...