Sciweavers

115 search results - page 2 / 23
» Recurrent policy gradients
ICPR
2004
IEEE
14 years 6 months ago
Improvement of Bidirectional Recurrent Neural Network for Learning Long-Term Dependencies
Bidirectional recurrent neural network (BRNN) is a noncausal generalization of recurrent neural network (RNN). It cannot learn remote information efficiently due to the problem of ...
Jinmiao Chen, Narendra S. Chaudhari
CORR
2006
Springer
13 years 5 months ago
A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD
This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(λ), LSTD(λ)...
Manuel Loth, Philippe Preux
IDEAL
2004
Springer
13 years 10 months ago
Policy Gradient Method for Team Markov Games
The main aim of this paper is to extend the single-agent policy gradient method for multiagent domains where all agents share the same utility function. We formulate these team pro...
Ville Könönen
NECO
2010
13 years 3 months ago
Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning
Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...
Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto...
ESANN
2000
13 years 6 months ago
An algorithm for the addition of time-delayed connections to recurrent neural networks
Recurrent neural networks possess interesting universal approximation capabilities, making them good candidates for time series modeling. Unfortunately, long term dependencies ar...
Romuald Boné, Michel Crucianu, Jean Pierre ...