In this paper, we propose a policy gradient reinforcement learning algorithm to address transition-independent Dec-POMDPs. This approach aims at implicitly exploiting the locality...
In this paper we partially describe JV2 M, a metaphorical simulation of the Java Virtual Machine where students can learn Java language compilation and reinforce object-oriented pr...
Abstract. The paper presents a method to guide the self-organised development of behaviours of autonomous robots. In earlier publications we demonstrated how to use the homeokinesi...
We present an efficient "sparse sampling" technique for approximating Bayes optimal decision making in reinforcement learning, addressing the well known exploration vers...
Tao Wang, Daniel J. Lizotte, Michael H. Bowling, D...
Abstract-- Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...
Frank Sehnke, Alex Graves, Christian Osendorfer, J...