Sciweavers

1237 search results - page 203 / 248
» Simulation sampling with live-points
NIPS
2001
15 years 8 days ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
NIPS
2001
15 years 8 days ago
Model-Free Least-Squares Policy Iteration
We propose a new approach to reinforcement learning which combines least squares function approximation with policy iteration. Our method is model-free and completely off policy. ...
Michail G. Lagoudakis, Ronald Parr
GRAPHICSINTERFACE
2000
15 years 7 days ago
Adaptive Representation of Specular Light Flux
Caustics produce beautiful and intriguing illumination patterns. However, their complex behavior makes them difficult to simulate accurately in all but the simplest configurations....
Normand Brière, Pierre Poulin
AAAI
1996
15 years 6 days ago
A Clinician's Tool for Analyzing Non-Compliance
We describe a computer program to assist a clinician with assessing the efficacy of treatments in experimental studies for which treatment assignment is random but subject complianc...
David Maxwell Chickering, Judea Pearl
WCE
2007
15 years 1 days ago
Bootstrap Confidence Interval for the Median Failure Time of Three-Parameter Weibull Distribution
In many applications of failure time data analysis, it is important to perform inferences about the median of the distribution function in situations of failure time data model...
N. A. Ibrahim, A. Kudus