Optical motion capture can be classified as an inference problem: given the data produced by a set of cameras, the aim is to extract the hidden state, which in this case encodes t...
—This paper reviews the different gradient-based schemes and the sources of gradient, their availability, precision and computational complexity, and explores the benefits of usi...
Boyang Li, Yew-Soon Ong, Minh Nghia Le, Chi Keong ...
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
We investigate the problem of non-covariant behavior of policy gradient reinforcement learning algorithms. The policy gradient approach is amenable to analysis by information geom...
Stochastic gradient descent (SGD) uses approximate gradients estimated from subsets of the training data and updates the parameters in an online fashion. This learning framework i...