We address two open theoretical questions in Policy Gradient Reinforcement Learning. The first concerns the efficacy of using function approximation to represent the state action ...
Abstract--The high-order phase function (HPF) is a useful tool to estimate the instantaneous frequency rate (IFR) of a signal with a polynomial phase. In this paper, the asymptotic...
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
We investigate why discretization is effective in naive-Bayes learning. We prove a theorem that identifies particular conditions under which discretization will result in naiveBay...
Image synthesis algorithms are commonly compared on the basis of running times and/or perceived quality of the generated images. In the case of Monte Carlo techniques, assessment ...