Sciweavers

109 search results - page 2 / 22
» Policy teaching through reward function learning
ICRA
2009
IEEE
13 years 11 months ago
Least absolute policy iteration for robust value function approximation
Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers...
Masashi Sugiyama, Hirotaka Hachiya, Hisashi Kashim...
IJCAI
2001
13 years 5 months ago
Exploiting Multiple Secondary Reinforcers in Policy Gradient Reinforcement Learning
Most formulations of Reinforcement Learning depend on a single reinforcement reward value to guide the search for the optimal policy solution. If observation of this reward is rar...
Gregory Z. Grudic, Lyle H. Ungar
ROMAN
2007
IEEE
13 years 10 months ago
Learning Reward Modalities for Human-Robot-Interaction in a Cooperative Training Task
This paper proposes a novel method of learning a user's preferred reward modalities for human-robot interaction by solving a cooperative training task. A learning algorithm ...
Anja Austermann, Seiji Yamada
NN
2007
Springer
13 years 3 months ago
Guiding exploration by pre-existing knowledge without modifying reward
Reinforcement learning is based on exploration of the environment and receiving reward that indicates which actions taken by the agent are good and which ones are bad. In many app...
Kary Främling
ICML
2004
IEEE
14 years 5 months ago
Apprenticeship learning via inverse reinforcement learning
We consider learning in a Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we wa...
Pieter Abbeel, Andrew Y. Ng