Sciweavers

Search results for: Policy teaching through reward function learning
IIE
2007
Investigation of Q-Learning in the Context of a Virtual Learning Environment
We investigate the possibility of applying Q-learning, a well-known machine learning algorithm, in the domain of a Virtual Learning Environment (VLE). It is important in this problem doma...
Dalia Baziukaite
AAMAS
2010
Springer
Teaching a pet-robot to understand user feedback through interactive virtual training tasks
In this paper, we present a human-robot teaching framework that uses "virtual" games as a means for adapting a robot to its user through natural interaction in a...
Anja Austermann, Seiji Yamada
IJCAI
2007
Bayesian Inverse Reinforcement Learning
Inverse Reinforcement Learning (IRL) is the problem of learning the reward function underlying a Markov Decision Process given the dynamics of the system and the behaviour of an e...
Deepak Ramachandran, Eyal Amir
NECO
2010
Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning
Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...
Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto...
ICML
2002
IEEE
Hierarchically Optimal Average Reward Reinforcement Learning
Two notions of optimality have been explored in previous work on hierarchical reinforcement learning (HRL): hierarchical optimality, or the optimal policy in the space defined by ...
Mohammad Ghavamzadeh, Sridhar Mahadevan