Sciweavers

SCAI
2008

Fast Learning in an Actor-Critic Architecture with Reward and Punishment

Abstract. A reinforcement architecture is introduced that consists of three complementary learning systems with different generalization abilities. The ACTOR learns state-action associations, the CRITIC learns a goal-gradient, and the PUNISH system learns what actions to avoid. The architecture is compared to the standard actor-critic and Q-learning models on a number of maze learning tasks. The novel architecture is shown to be superior on all the test mazes. Moreover, it shows how it is possible to combine several learning systems with different properties in a coherent reinforcement learning framework.
Keywords. Reinforcement learning, reward, punishment, generalization
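The abstract gives no equations, so the following is only a minimal tabular sketch of how three such systems might be combined on a small grid maze. It is not the authors' implementation. The function name `train`, the grid layout, the learning rates, the softmax action selection, and the rule that bumping into the outer wall counts as punishment are all assumptions made for illustration.

```python
import math
import random


def train(width=4, height=4, goal=(3, 3), episodes=200, seed=0):
    """Hypothetical three-table learner: actor, critic, and punish."""
    rng = random.Random(seed)
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    states = [(x, y) for x in range(width) for y in range(height)]

    actor = {s: [0.0] * 4 for s in states}   # state-action preferences
    critic = {s: 0.0 for s in states}        # state values (goal gradient)
    punish = {s: [0.0] * 4 for s in states}  # learned inhibition per action

    # Assumed constants, not taken from the paper.
    alpha, beta, gamma, max_inhibition = 0.5, 0.5, 0.9, 5.0

    steps_per_episode = []
    for _ in range(episodes):
        s, steps = (0, 0), 0
        while s != goal and steps < 500:
            # Softmax over actor preferences minus learned punishment.
            prefs = [actor[s][a] - punish[s][a] for a in range(4)]
            top = max(prefs)
            weights = [math.exp(p - top) for p in prefs]
            a = rng.choices(range(4), weights=weights)[0]
            nx, ny = s[0] + moves[a][0], s[1] + moves[a][1]
            steps += 1

            if not (0 <= nx < width and 0 <= ny < height):
                # Wall collision: only the punish table is updated,
                # and the agent stays where it is.
                punish[s][a] += beta * (max_inhibition - punish[s][a])
                continue

            s2 = (nx, ny)
            reward = 1.0 if s2 == goal else 0.0
            bootstrap = 0.0 if s2 == goal else gamma * critic[s2]
            delta = reward + bootstrap - critic[s]  # TD error
            critic[s] += alpha * delta              # critic update
            actor[s][a] += alpha * delta            # actor update
            s = s2
        steps_per_episode.append(steps)
    return actor, critic, punish, steps_per_episode
```

In this sketch the punish table is updated immediately from a single collision and never from the TD error, which is one simple way to keep the avoidance signal separate from the reward signal. Whether the paper separates them in this way is not stated in the abstract.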
Christian Balkenius, Stefan Winberg
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2008
Where SCAI
Authors Christian Balkenius, Stefan Winberg