Search Sciweavers | Sciweavers

139 search results - page 7 / 28

» Model-based function approximation in reinforcement learning

Voted

ICML
2007
IEEE

162views Machine Learning» more ICML 2007»

Automatic shaping and decomposition of reward functions

16 years 3 months ago

Download www.machinelearning.org

This paper investigates the problem of automatically learning how to restructure the reward function of a Markov decision process so as to speed up reinforcement learning. We begi...

Bhaskara Marthi

claim paper

Read More »

148

Voted

IAT
2003
IEEE

171views Intelligent Agents» more IAT 2003»

Asymmetric Multiagent Reinforcement Learning

15 years 8 months ago

Download lib.tkk.fi

A gradient-based method for both symmetric and asymmetric multiagent reinforcement learning is introduced in this paper. Symmetric multiagent reinforcement learning addresses the ...

Ville Könönen

claim paper

Read More »

118

Voted

ICRA
2009
IEEE

143views Robotics» more ICRA 2009»

Least absolute policy iteration for robust value function approximation

15 years 9 months ago

Download sugiyama-www.cs.titech.ac.jp

Abstract— Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efﬁciency. However, it tends to be sensitive to outliers...

Masashi Sugiyama, Hirotaka Hachiya, Hisashi Kashim...

claim paper

Read More »

133

Voted

ESANN
2008

164views Neural Networks» more ESANN 2008»

Multilayer Perceptrons with Radial Basis Functions as Value Functions in Reinforcement Learning

15 years 4 months ago

Download www.dice.ucl.ac.be

Using multilayer perceptrons (MLPs) to approximate the state-action value function in reinforcement learning (RL) algorithms could become a nightmare due to the constant possibilit...

Victor Uc Cetina

claim paper

Read More »

136

Voted

ICML
2001
IEEE

185views Machine Learning» more ICML 2001»

Off-Policy Temporal Difference Learning with Function Approximation

16 years 3 months ago

Download www.cs.ualberta.ca

We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...

Doina Precup, Richard S. Sutton, Sanjoy Dasgupta

claim paper

Read More »

« Prev « First page 7 / 28 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers