Search Sciweavers | Sciweavers

210 search results - page 26 / 42

» An analysis of reinforcement learning with function approxim...


ICML
2003
IEEE

146 views, Machine Learning

TD(0) Converges Provably Faster than the Residual Gradient Algorithm

16 years 16 days ago

Download www.hpl.hp.com

In Reinforcement Learning (RL) there has been some experimental evidence that the residual gradient algorithm converges slower than the TD(0) algorithm. In this paper, we use the ...

Ralf Schoknecht, Artur Merke


ICML
2008
IEEE

117 views, Machine Learning

Sample-based learning and search with permanent and transient memories

16 years 16 days ago

Download www.cs.ualberta.ca

We present a reinforcement learning architecture, Dyna-2, that encompasses both sample-based learning and sample-based search, and that generalises across states during both learni...

David Silver, Martin Müller, Richard S. ...


NIPS
2008

130 views, Information Technology

Temporal Difference Based Actor Critic Learning - Convergence and Neural Implementation

15 years 1 months ago

Download eprints.pascal-network.org

Actor-critic algorithms for reinforcement learning are achieving renewed popularity due to their good convergence properties in situations where other approaches often fail (e.g.,...

Dotan Di Castro, Dmitry Volkinshtein, Ron Meir


ICML
1999
IEEE

168 views, Machine Learning

Least-Squares Temporal Difference Learning

16 years 16 days ago

Download www.research.rutgers.edu

Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...

Justin A. Boyan


NIPS
2008

110 views, Information Technology

Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms

15 years 1 months ago

Download groups.csail.mit.edu

Policy gradient (PG) reinforcement learning algorithms have strong (local) convergence guarantees, but their learning performance is typically limited by a large variance in the e...

John W. Roberts, Russ Tedrake
