We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
In this paper, we consider the problem of planning and learning in the infinite-horizon discounted-reward Markov decision problems. We propose a novel iterative direct policysearc...
Efficient and accurate fitting of Active Appearance Models (AAM) is a key requirement for many applications. The most efficient fitting algorithm today is Inverse Compositiona...
Under certain conditions on the integrand, quasi-Monte Carlo methods for estimating integrals (expectations) converge faster asymptotically than Monte Carlo methods. Motivated by ...
Shane G. Henderson, Belinda A. Chiera, Roger M. Co...
The ratio of two probability densities can be used for solving various machine learning tasks such as covariate shift adaptation (importance sampling), outlier detection (likeliho...