We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
In this paper, we analyze the bounds of the fixed common step-size parameter GMDFµ for the generalized multidelay adaptive filter (GMDF). Frequency-domain adaptive filters are ...
In this paper, a method for the performance assessment of a variable-gain control design for optical storage drives is proposed. The variable-gain strategy is used to overcome well...
Nathan van de Wouw, H. A. Pastink, Marcel F. Heert...
In this paper, we consider the problem of planning and learning in infinite-horizon discounted-reward Markov decision problems. We propose a novel iterative direct policy-searc...
The design of sensor networks capable of reaching a consensus on a globally optimal decision test, without the need for a fusion center, is a problem that has received considerable...