In this paper, we look at a supply chain of commodity goods where customer demand is uncertain and partly based on reputation, and where raw material replenishment is uncertain in...
We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...
Helicopter hovering is an important challenge problem in the field of reinforcement learning. This paper considers several neuroevolutionary approaches to discovering robust cont...
Decentralized Markov decision processes are frequently used to model cooperative multi-agent systems. In this paper, we identify a subclass of general DEC-MDPs that features regul...
Programming robots to carry out useful tasks is a complex and non-trivial exercise. A simple and intuitive method to allow humans to train and shape robot behaviour is clearl...
Joe Saunders, Chrystopher L. Nehaniv, Kerstin Daut...