Reinforcement Learning and Shaping: Encouraging Intended Behaviors

We explore dynamic shaping as a way to integrate prior beliefs about the final policy into a conventional reinforcement learning system. Shaping adds a positive or negative artificial increment to the native task rewards in order to encourage or discourage particular behaviors. Previous shaping functions have been static: the additional rewards do not vary with experience. Some prior knowledge, however, cannot be expressed as static shaping. We take an explanation-based approach in which the specific shaping function emerges from the agent's initial experiences with the world. We compare no shaping, static shaping, and dynamic shaping on the task of learning bipedal walking in a simulator. We empirically evaluate the convergence rate and final performance under these three conditions while varying the accuracy of the prior knowledge. We conclude that, in the appropriate context, dynamic shaping can greatly improve the learning of action policies.
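To make the distinction between static and dynamic shaping concrete, the following is a minimal sketch in Python. It is not the explanation-based method from the paper, whose details the abstract does not give. The names `static_shaping`, `DynamicShaping`, and `shaped_reward`, the scalar state, the 0.1 bonus weight, and the warm-up rule are all hypothetical choices made for illustration only.

```python
from typing import Callable, List

# Assumption: a state is summarized by one scalar, e.g. forward position.
State = float
ShapingFn = Callable[[State, State], float]


def static_shaping(state: State, next_state: State) -> float:
    """Static shaping: a fixed bonus for forward progress.

    The formula never changes, no matter how much experience is gathered.
    """
    return 0.1 * (next_state - state)


class DynamicShaping:
    """Hypothetical dynamic shaping: the bonus is calibrated from early experience.

    During a short warm-up the agent only observes how large its steps are.
    Afterwards, progress is rewarded relative to the largest step seen in warm-up.
    """

    def __init__(self, warmup_steps: int = 3) -> None:
        self.warmup_steps = warmup_steps
        self.progress_samples: List[float] = []

    def __call__(self, state: State, next_state: State) -> float:
        progress = next_state - state
        if len(self.progress_samples) < self.warmup_steps:
            # Still gathering initial experience: no artificial reward yet.
            self.progress_samples.append(abs(progress))
            return 0.0
        # Fall back to a scale of 1.0 if no movement was seen during warm-up.
        scale = max(self.progress_samples) or 1.0
        return 0.1 * progress / scale


def shaped_reward(native_reward: float, state: State, next_state: State,
                  shaping: ShapingFn) -> float:
    """Add the artificial shaping increment to the native task reward."""
    return native_reward + shaping(state, next_state)
```

In this sketch the learner would receive `shaped_reward(...)` in place of the native reward. The static function returns the same bonus for the same transition at any point in training, while the dynamic object's bonus depends on what it observed during its first few transitions.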
Adam Laud, Gerald DeJong
Added 17 Nov 2009
Updated 17 Nov 2009
Type Conference
Year 2002
Where ICML
Authors Adam Laud, Gerald DeJong