Group utility functions are an extension of the common team utility function for providing multiple agents with a common reinforcement learning signal for learning cooperat...
Aiming to clarify the convergence or divergence conditions for Learning Classifier Systems (LCS), this paper explores: (1) an extreme condition where the reinforcement process of ...
For a Markov Decision Process with finite state (size S) and action spaces (size A per state), we propose a new algorithm, Delayed Q-Learning. We prove it is PAC, achieving near o...
Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...
This paper describes Icarus, an agent architecture that embeds a hierarchical reinforcement learning algorithm within a language for specifying agent behavior. An Icarus program e...
Autonomous agents that learn about their environment can be divided into two broad classes. One class of existing learners, reinforcement learners, typically employ weak learning ...