Search Sciweavers | Sciweavers

536 search results - page 1 / 108

ICML
1995
IEEE

Residual Algorithms: Reinforcement Learning with Function Approximation

16 years 6 months ago

Download www.leemon.com

A number of reinforcement learning algorithms have been developed that are guaranteed to converge to the optimal solution when used with lookup tables. It is shown, however, that ...

Leemon C. Baird III

ICML
2004
IEEE

Convergence of synchronous reinforcement learning with linear function approximation

16 years 6 months ago

Download www.machinelearning.org

Synchronous reinforcement learning (RL) algorithms with linear function approximation are representable as inhomogeneous matrix iterations of a special form (Schoknecht & Merk...

Artur Merke, Ralf Schoknecht

ICML
2003
IEEE

TD(0) Converges Provably Faster than the Residual Gradient Algorithm

16 years 6 months ago

Download www.hpl.hp.com

In Reinforcement Learning (RL) there has been some experimental evidence that the residual gradient algorithm converges slower than the TD(0) algorithm. In this paper, we use the ...

Ralf Schoknecht, Artur Merke

ICMLA
2008

Basis Function Construction in Reinforcement Learning Using Cascade-Correlation Learning Architecture

15 years 6 months ago

Download www.grappa.univ-lille3.fr

In reinforcement learning, it is a common practice to map the state(-action) space to a different one using basis functions. This transformation aims to represent the input data i...

Sertan Girgin, Philippe Preux

ICML
2001
IEEE

Off-Policy Temporal Difference Learning with Function Approximation

16 years 6 months ago

Download www.cs.ualberta.ca

We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...

Doina Precup, Richard S. Sutton, Sanjoy Dasgupta
