We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov Chains. The method is based on designing sequential control variates using s...
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
We present an algorithm to generate samples from probability distributions on the space of curves. Traditional curve evolution methods use gradient descent to find a local minimum...
Ayres C. Fan, John W. Fisher III, Jonathan Kane, A...