Sciweavers

109 search results - page 3 / 22
» Policy teaching through reward function learning
ICONIP
2007
13 years 6 months ago
Finding Exploratory Rewards by Embodied Evolution and Constrained Reinforcement Learning in the Cyber Rodents
The aim of the Cyber Rodent project [1] is to elucidate the origin of our reward and affective systems by building artificial agents that share the natural biological constraints...
Eiji Uchibe, Kenji Doya
PKDD
2009
Springer
13 years 11 months ago
Active Learning for Reward Estimation in Inverse Reinforcement Learning
Abstract. Inverse reinforcement learning addresses the general problem of recovering a reward function from samples of a policy provided by an expert/demonstrator. In this paper, w...
Manuel Lopes, Francisco S. Melo, Luis Montesano
ICML
2008
IEEE
14 years 5 months ago
Learning all optimal policies with multiple criteria
We describe an algorithm for learning in the presence of multiple criteria. Our technique generalizes previous approaches in that it can learn optimal policies for all linear pref...
Leon Barrett, Srini Narayanan
FLAIRS
2003
13 years 6 months ago
Learning from Reinforcement and Advice Using Composite Reward Functions
Reinforcement learning has become a widely used methodology for creating intelligent agents in a wide range of applications. However, its performance deteriorates in tasks with s...
Vinay N. Papudesi, Manfred Huber
ICMLA
2010
13 years 2 months ago
Multimodal Parameter-exploring Policy Gradients
Abstract. Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...
Frank Sehnke, Alex Graves, Christian Osendorfer, J...