This paper presents postponed updates, a new strategy for TD methods that can improve sample efficiency without incurring the computational and space requirements of model-based ...
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...
In this article we describe a set of scalable techniques for learning the behavior of a group of agents in a collaborative multiagent setting. As a basis we use the framework of c...