Dynamic Reward Shaping: Training a Robot by Voice

Reinforcement learning is commonly used for learning tasks in robotics; however, traditional algorithms can require very long training times. Reward shaping has recently been used to supply domain knowledge through extra rewards so that learning converges faster. The reward shaping functions are normally defined in advance by the user and are static. This paper introduces a dynamic reward shaping approach in which these extra rewards are not consistently given, can vary with time, and may sometimes be contrary to what is needed to achieve a goal. In the experiments, a user provides verbal feedback while a robot is performing a task, and this feedback is translated into additional rewards. It is shown that convergence can still be guaranteed as long as most of the shaping rewards given per state are consistent with the goals, and that even with fairly noisy interaction the system can still converge faster than traditional reinforcement learning techniques.
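To make the idea in the abstract concrete, here is a minimal sketch of shaping rewards that arrive only some of the time and are occasionally wrong. It is not the authors' implementation. The corridor task, the function names `train_with_voice_feedback` and `greedy_policy`, and every parameter value (`feedback_rate`, `noise`, the ±0.5 feedback magnitude) are assumptions chosen for illustration, and the spoken feedback is simulated by a random process rather than by speech recognition.

```python
import random


def train_with_voice_feedback(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                              epsilon=0.2, feedback_rate=0.5, noise=0.2, seed=0):
    """Tabular Q-learning on a 1-D corridor with simulated, unreliable feedback.

    Hypothetical setup: the agent starts in state 0 and the goal is the last
    state. A simulated "user" comments on a fraction of the steps
    (feedback_rate) and is wrong with probability `noise`.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right

    for _episode in range(episodes):
        state = 0
        for _step in range(100):  # cap on episode length
            # Epsilon-greedy action selection (ties go to "right").
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            next_state = min(goal, state + 1) if action == 1 else max(0, state - 1)
            done = next_state == goal
            reward = 1.0 if done else 0.0  # reward from the task itself

            # Dynamic shaping: feedback is not always given, and when it is,
            # it may contradict what the goal requires.
            shaping = 0.0
            if rng.random() < feedback_rate:
                shaping = 0.5 if action == 1 else -0.5  # praise moves toward the goal
                if rng.random() < noise:
                    shaping = -shaping  # mistaken or misheard feedback

            # Standard Q-learning update on the task reward plus shaping reward.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + shaping + future - q[state][action])

            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Greedy action for every non-goal state (0 = left, 1 = right)."""
    return [0 if left > right else 1 for left, right in q[:-1]]
```

Calling `greedy_policy(train_with_voice_feedback(noise=0.2))` trains with feedback that is wrong about one time in five. Varying `noise` and `feedback_rate` gives a rough feel for the claim that learning tolerates inconsistent feedback as long as most of it agrees with the goal, though this toy task says nothing about the convergence guarantee or the robot experiments reported in the paper.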

Ana C. Tenorio-Gonzalez, Eduardo F. Morales, Luis Villaseñor Pineda

Artificial Intelligence | Extra Rewards | IBERAMIA 2010 | Long Training Times | Reward Shaping Approach |

» How robot morphology and training order affect the learning of multiple behaviors

» Development of a Femininity Estimator for Voice Therapy of Gender Identity Disorder Client...

» When Policies Can Be Trusted Analyzing a Criteria to Identify Optimal Policies in MDPs wit...

» Learning Dynamics of Complex Motions from Image Sequences

» Emergence Exploration and Learning of Embodied Behavior

» The Performance of Approximating Ordinary Differential Equations by Neural Nets

Post Info

Added	25 Jan 2011
Updated	25 Jan 2011
Type	Journal
Year	2010
Where	IBERAMIA
Authors	Ana C. Tenorio-Gonzalez, Eduardo F. Morales, Luis Villaseñor Pineda

Sciweavers
