Search Sciweavers | Sciweavers

132 search results - page 13 / 27

» Generalization in Reinforcement Learning: Safely Approximati...

229

click to vote

JMLR
2010

119views more JMLR 2010»

A Convergent Online Single Time Scale Actor Critic Algorithm

15 years 2 months ago

Download jmlr.csail.mit.edu

Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...

Dotan Di Castro, Ron Meir

claim paper

Read More »

230

click to vote

ICML
1999
IEEE

168views Machine Learning» more ICML 1999»

Least-Squares Temporal Difference Learning

16 years 8 months ago

Download www.research.rutgers.edu

Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...

Justin A. Boyan

claim paper

Read More »

203

click to vote

JMLR
2006

153views more JMLR 2006»

Collaborative Multiagent Reinforcement Learning by Payoff Propagation

15 years 7 months ago

Download jmlr.csail.mit.edu

In this article we describe a set of scalable techniques for learning the behavior of a group of agents in a collaborative multiagent setting. As a basis we use the framework of c...

Jelle R. Kok, Nikos A. Vlassis

claim paper

Read More »

227

click to vote

AAAI
1998

181views Intelligent Agents» more AAAI 1998»

Applying Online Search Techniques to Continuous-State Reinforcement Learning

15 years 9 months ago

Download www.autonlab.org

In this paper, we describe methods for e ciently computing better solutions to control problems in continuous state spaces. We provide algorithms that exploit online search to boo...

Scott Davies, Andrew Y. Ng, Andrew W. Moore

claim paper

Read More »

178

click to vote

ICML
2003
IEEE

146views Machine Learning» more ICML 2003»

TD(0) Converges Provably Faster than the Residual Gradient Algorithm

16 years 8 months ago

Download www.hpl.hp.com

In Reinforcement Learning (RL) there has been some experimental evidence that the residual gradient algorithm converges slower than the TD(0) algorithm. In this paper, we use the ...

Ralf Schoknecht, Artur Merke

claim paper

Read More »

« Prev « First page 13 / 27 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers