Sciweavers

ATAL
2015
Springer

Policy Transfer using Reward Shaping

Download ai.vub.ac.be

Transfer learning has proven to be a wildly successful approach for speeding up reinforcement learning. Techniques often use low-level information obtained in the source task to achieve successful transfer in the target task. Yet the most general transfer approach can assume access only to the output of the learning algorithm in the source task, i.e., the learned policy, which enables transfer irrespective of the learning algorithm used in the source task. We advance the state of the art by using a reward shaping approach to policy transfer. One advantage of this approach is that it firmly grounds policy transfer in an actively developing body of theoretical research on reward shaping. Experiments in Mountain Car, Cart Pole and Mario demonstrate the practical usefulness of the approach.

Categories and Subject Descriptors: I.2.6 [Learning]: Miscellaneous
General Terms: Algorithms, Performance
Keywords: Reinforcement Learning; Transfer Learning; Reward Shaping
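The abstract does not spell out how the source policy enters the reward, so the following is only a minimal sketch under stated assumptions. It assumes the standard potential-based form of reward shaping over state-action pairs, F(s, a, s', a') = gamma * Phi(s', a') - Phi(s, a), and it assumes that the potential Phi is the probability the source policy assigns to an action. The names `make_shaping_reward`, `toy_source_policy` and `scale` are hypothetical and do not come from the paper.

```python
from typing import Callable, Hashable

State = Hashable
Action = Hashable
# A source policy here is any function returning the probability of taking
# `action` in `state` (assumed already mapped into the target task).
Policy = Callable[[State, Action], float]


def make_shaping_reward(source_policy: Policy, gamma: float, scale: float = 1.0):
    """Build a potential-based shaping term from a transferred policy.

    Assumption: the potential of a state-action pair is the (scaled)
    probability the source policy gives that action in that state.
    """

    def potential(state: State, action: Action) -> float:
        return scale * source_policy(state, action)

    def shaping(state: State, action: Action,
                next_state: State, next_action: Action) -> float:
        # F(s, a, s', a') = gamma * Phi(s', a') - Phi(s, a)
        return gamma * potential(next_state, next_action) - potential(state, action)

    return shaping


def toy_source_policy(state: State, action: Action) -> float:
    """Hypothetical deterministic source policy that always prefers 'right'."""
    return 1.0 if action == "right" else 0.0


shaping = make_shaping_reward(toy_source_policy, gamma=0.9)

# Moving from a dispreferred action to a preferred one yields a bonus,
# and the reverse yields a penalty.
bonus = shaping(0, "left", 1, "right")
penalty = shaping(0, "right", 1, "left")
```

In use, the shaping term would be added to the environment reward inside the learner's update, for example `r + shaping(s, a, s_next, a_next)` in a SARSA-style update. Only the learned policy of the source task is needed, which matches the abstract's point that transfer should not depend on the source learning algorithm.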

Tim Brys, Anna Harutyunyan, Matthew E. Taylor, Ann Nowé

ATAL 2015 | Intelligent Agents

Related Content

» Autonomous shaping knowledge transfer in reinforcement learning

» Probabilistic Policy Reuse for intertask transfer learning

» The Influence of Reward on the Speed of Reinforcement Learning An Analysis of Shaping

» Reward shaping for valuing communications during multiagent coordination

» Interactively shaping agents via human reinforcement the TAMER framework

» Transfer of task representation in reinforcement learning using policybased protovalue fun...

» Combining manual feedback with subsequent MDP reward signals for reinforcement learning

» A Survey of the UseItOrLoseIt Policies for the ABR Service in ATM Networks

» Useit or Loseit Policies for the Available Bit Rate ABR Service in ATM Networks

» Goaldirected decision making in prefrontal cortex a computational framework

Post Info

Added	16 Apr 2016
Updated	16 Apr 2016
Type	Journal
Year	2015
Where	ATAL
Authors	Tim Brys, Anna Harutyunyan, Matthew E. Taylor, Ann Nowé
