temporal difference | Sciweavers

AAAI
2011

144 views, Intelligent Agents

Differential Eligibility Vectors for Advantage Updating and Gradient Methods

12 years 4 months ago

In this paper we propose differential eligibility vectors (DEV) for temporal-difference (TD) learning, a new class of eligibility vectors designed to bring out the contribution of...

Francisco S. Melo

JMLR
2010

119 views

A Convergent Online Single Time Scale Actor Critic Algorithm

12 years 11 months ago

Download jmlr.csail.mit.edu

Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...

Dotan Di Castro, Ron Meir

CDC
2010
IEEE

136 views, Control Systems

Pathologies of temporal difference methods in approximate dynamic programming

12 years 11 months ago

Download web.mit.edu

Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated conve...

Dimitri P. Bertsekas

ICML
2010
IEEE

222 views, Machine Learning

Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda

13 years 2 months ago

Download www.icml2010.org

Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...

Carlton Downey, Scott Sanner

CORR
2010
Springer

204 views, Education

Predictive State Temporal Difference Learning

13 years 3 months ago

Download www.cs.cmu.edu

We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with subspace identification. In practical applications...

Byron Boots, Geoffrey J. Gordon

NCA
2008
IEEE

165 views, Computer Networks

Neurodynamic programming: a case study of the traveling salesman problem

13 years 4 months ago

Download www.ece.uic.edu

The paper focuses on the study of solving the large-scale traveling salesman problem (TSP) based on neurodynamic programming. From this perspective, two methods, temporal differenc...

Jia Ma, Tao Yang, Zeng-Guang Hou, Min Tan, Derong ...

IAT
2008
IEEE

161 views, Intelligent Agents

Scaling Up Multi-agent Reinforcement Learning in Complex Domains

13 years 4 months ago

Download www3.ntu.edu.sg

TD-FALCON (Temporal Difference - Fusion Architecture for Learning, COgnition, and Navigation) is a class of self-organizing neural networks that incorporates Temporal Difference (...

Dan Xiao, Ah-Hwee Tan

CEC
2010
IEEE

216 views, Artificial Intelligence

Coevolutionary Temporal Difference Learning for small-board Go

13 years 4 months ago

Download www.cs.put.poznan.pl

In this paper we apply Coevolutionary Temporal Difference Learning (CTDL), a hybrid of coevolutionary search and reinforcement learning proposed in our former study, to evolve s...

Krzysztof Krawiec, Marcin Szubert

posted by mszubert

NIPS
2001

206 views, Information Technology

Model-Free Least-Squares Policy Iteration

13 years 6 months ago

Download www.cs.duke.edu

We propose a new approach to reinforcement learning which combines least squares function approximation with policy iteration. Our method is model-free and completely off policy. ...

Michail G. Lagoudakis, Ronald Parr

NIPS
2008

130 views, Information Technology

Temporal Difference Based Actor Critic Learning - Convergence and Neural Implementation

13 years 6 months ago

Download eprints.pascal-network.org

Actor-critic algorithms for reinforcement learning are achieving renewed popularity due to their good convergence properties in situations where other approaches often fail (e.g.,...

Dotan Di Castro, Dmitry Volkinshtein, Ron Meir
