Sciweavers

NCI
2004

Hierarchical reinforcement learning with subpolicies specializing for learned subgoals

This paper describes a method for hierarchical reinforcement learning in which high-level policies automatically discover subgoals, and low-level policies learn to specialize for different subgoals. Subgoals are represented as abstract observations which cluster raw input data. High-level value functions cover the state space at a coarse level; low-level value functions cover only parts of the state space at a fine-grained level. An experiment shows that this method outperforms several flat reinforcement learning methods. A second experiment shows how problems of observability due to observation abstraction can be overcome using high-level policies with memory.
Key words: Reinforcement learning, hierarchical reinforcement learning, feedforward neural networks, recurrent neural networks, MDPs, POMDPs, short-term memory
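The idea of subgoals as abstract observations that cluster raw inputs can be illustrated with a small sketch. This is not the authors' implementation: the nearest-centroid clustering, the fixed `CENTROIDS`, the dictionary-based `q_high` table, and all function names are assumptions chosen only to make the two-level structure concrete.

```python
import math

# Hypothetical cluster centres; in a learning system these would be
# adapted from raw input data rather than fixed by hand.
CENTROIDS = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]


def abstract_observation(raw, centroids=CENTROIDS):
    """Map a raw observation to the index of its nearest cluster centre."""
    return min(range(len(centroids)), key=lambda i: math.dist(raw, centroids[i]))


def choose_subgoal(raw, q_high, centroids=CENTROIDS):
    """Greedy high-level choice: pick the target cluster with the highest
    coarse-level value, given the current abstract observation.
    q_high is an assumed table keyed by (current_cluster, subgoal)."""
    current = abstract_observation(raw, centroids)
    candidates = [g for g in range(len(centroids)) if g != current]
    return max(candidates, key=lambda g: q_high.get((current, g), 0.0))


def subgoal_reached(raw, subgoal, centroids=CENTROIDS):
    """A low-level policy is done once the raw observation falls in the
    cluster that the high-level policy selected as subgoal."""
    return abstract_observation(raw, centroids) == subgoal
```

In this sketch the high-level table only sees cluster indices, while a low-level policy (not shown) would act on the raw observation until `subgoal_reached` returns true.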
Bram Bakker, Jürgen Schmidhuber
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2004
Where NCI
Authors Bram Bakker, Jürgen Schmidhuber