In this paper we propose differential eligibility vectors (DEV) for temporal-difference (TD) learning, a new class of eligibility vectors designed to bring out the contribution of...
Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...
Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated conve...
Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...
We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with subspace identification. In practical applications...
The paper focuses on the study of solving the large-scale traveling salesman problem (TSP) based on neurodynamic programming. From this perspective, two methods, temporal differenc...
Jia Ma, Tao Yang, Zeng-Guang Hou, Min Tan, Derong ...
TD-FALCON (Temporal Difference - Fusion Architecture for Learning, COgnition, and Navigation) is a class of self-organizing neural networks that incorporates Temporal Difference (...
—In this paper we apply Coevolutionary Temporal Difference Learning (CTDL), a hybrid of coevolutionary search and reinforcement learning proposed in our former study, to evolve s...
We propose a new approach to reinforcement learning which combines least squares function approximation with policy iteration. Our method is model-free and completely off policy. ...
Actor-critic algorithms for reinforcement learning are achieving renewed popularity due to their good convergence properties in situations where other approaches often fail (e.g.,...