We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
A number of reinforcement learning algorithms have been developed that are guaranteed to converge to the optimal solution when used with lookup tables. It is shown, however, that ...
Many statistical translation models can be regarded as weighted logical deduction. Under this paradigm, we use weights from the expectation semiring (Eisner, 2002), to compute fir...
We present SpeedBoost, a natural extension of functional gradient descent, for learning anytime predictors, which automatically trade computation time for predictive accuracy by s...
The best recent supervised sequence learning methods use gradient descent to train networks of miniature nets called memory cells. The most popular cell structure seems somewhat ar...