Sciweavers

65 search results - page 3 / 13
» Bias and variance in value function estimation
NIPS
2001
Rates of Convergence of Performance Gradient Estimates Using Function Approximation and Bias in Reinforcement Learning
We address two open theoretical questions in Policy Gradient Reinforcement Learning. The first concerns the efficacy of using function approximation to represent the state action ...
Gregory Z. Grudic, Lyle H. Ungar
TSP
2010
Performance of instantaneous frequency rate estimation using high-order phase function
The high-order phase function (HPF) is a useful tool to estimate the instantaneous frequency rate (IFR) of a signal with a polynomial phase. In this paper, the asymptotic...
Pu Wang, Hongbin Li, Igor Djurovic, Braham Himed
NIPS
2001
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
AUSAI
2003
Springer
On Why Discretization Works for Naive-Bayes Classifiers
We investigate why discretization is effective in naive-Bayes learning. We prove a theorem that identifies particular conditions under which discretization will result in naive-Bay...
Ying Yang, Geoffrey I. Webb
PG
2007
IEEE
Statistical Hypothesis Testing for Assessing Monte Carlo Estimators: Applications to Image Synthesis
Image synthesis algorithms are commonly compared on the basis of running times and/or perceived quality of the generated images. In the case of Monte Carlo techniques, assessment ...
Kartic Subr, James Arvo