Search Sciweavers | Sciweavers

109 search results - page 2 / 22

» Policy teaching through reward function learning

click to vote

ICRA
2009
IEEE

143views Robotics» more ICRA 2009»

Least absolute policy iteration for robust value function approximation

13 years 11 months ago

Download sugiyama-www.cs.titech.ac.jp

Abstract— Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efﬁciency. However, it tends to be sensitive to outliers...

Masashi Sugiyama, Hirotaka Hachiya, Hisashi Kashim...

claim paper

Read More »

click to vote

IJCAI
2001

163views Artificial Intelligence» more IJCAI 2001»

Exploiting Multiple Secondary Reinforcers in Policy Gradient Reinforcement Learning

13 years 5 months ago

Download www.cs.colorado.edu

Most formulations of Reinforcement Learning depend on a single reinforcement reward value to guide the search for the optimal policy solution. If observation of this reward is rar...

Gregory Z. Grudic, Lyle H. Ungar

claim paper

Read More »

click to vote

ROMAN
2007
IEEE

134views Robotics» more ROMAN 2007»

Learning Reward Modalities for Human-Robot-Interaction in a Cooperative Training Task

13 years 10 months ago

Download www.robotopia.de

—This paper proposes a novel method of learning a users preferred reward modalities for human-robot interaction through solving a cooperative training task. A learning algorithm ...

Anja Austermann, Seiji Yamada

claim paper

Read More »

click to vote

NN
2007
Springer

105views Neural Networks» more NN 2007»

Guiding exploration by pre-existing knowledge without modifying reward

13 years 3 months ago

Download www.cs.hut.fi

Reinforcement learning is based on exploration of the environment and receiving reward that indicates which actions taken by the agent are good and which ones are bad. In many app...

Kary Främling

claim paper

Read More »

click to vote

ICML
2004
IEEE

214views Machine Learning» more ICML 2004»

Apprenticeship learning via inverse reinforcement learning

14 years 5 months ago

Download ai.stanford.edu

We consider learning in a Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we wa...

Pieter Abbeel, Andrew Y. Ng

claim paper

Read More »

« Prev « First page 2 / 22 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers