Search Sciweavers | Sciweavers

6 search results - page 1 / 2

ISDA
2009
IEEE

Postponed Updates for Temporal-Difference Reinforcement Learning

16 years 2 months ago

Download www.science.uva.nl

This paper presents postponed updates, a new strategy for TD methods that can improve sample efficiency without incurring the computational and space requirements of model-based ...

Harm van Seijen, Shimon Whiteson

ICML
1999
IEEE

Least-Squares Temporal Difference Learning

16 years 8 months ago

Download www.research.rutgers.edu

Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...

Justin A. Boyan

NIPS
2007

Incremental Natural Actor-Critic Algorithms

15 years 8 months ago

Download books.nips.cc

We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...

Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...

JMLR
2010

A Convergent Online Single Time Scale Actor Critic Algorithm

15 years 2 months ago

Download jmlr.csail.mit.edu

Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...

Dotan Di Castro, Ron Meir

JMLR
2006

Collaborative Multiagent Reinforcement Learning by Payoff Propagation

15 years 7 months ago

Download jmlr.csail.mit.edu

In this article we describe a set of scalable techniques for learning the behavior of a group of agents in a collaborative multiagent setting. As a basis we use the framework of c...

Jelle R. Kok, Nikos A. Vlassis
