Bidirectional recurrent neural network(BRNN) is a noncausal generalization of recurrent neural network(RNN). It can not learn remote information efficiently due to the problem of ...
This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(), LSTD()...
The main aim of this paper is to extend the single-agent policy gradient method for multiagent domains where all agents share the same utility function. We formulate these team pro...
Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...
: Recurrent neural networks possess interesting universal approximation capabilities, making them good candidates for time series modeling. Unfortunately, long term dependencies ar...