Sciweavers

52 search results - page 7 / 11
» Error Bounds for Approximate Policy Iteration
ICML
2004
IEEE
15 years 10 months ago
Approximate inference by Markov chains on union spaces
A standard method for approximating averages in probabilistic models is to construct a Markov chain in the product space of the random variables with the desired equilibrium distr...
Max Welling, Michal Rosen-Zvi, Yee Whye Teh
UAI
2008
14 years 11 months ago
CORL: A Continuous-state Offset-dynamics Reinforcement Learner
Continuous state spaces and stochastic, switching dynamics characterize a number of rich, real-world domains, such as robot navigation across varying terrain. We describe a reinfor...
Emma Brunskill, Bethany R. Leffler, Lihong Li, Mic...
SIAMIS
2011
14 years 4 months ago
Gradient-Based Methods for Sparse Recovery
The convergence rate is analyzed for the sparse reconstruction by separable approximation (SpaRSA) algorithm for minimizing a sum f(x) + ψ(x), where f is smooth and ψ is convex, ...
William W. Hager, Dzung T. Phan, Hongchao Zhang
FUIN
2010
14 years 7 months ago
Cluster Tree Elimination for Distributed Constraint Optimization with Quality Guarantees
Some distributed constraint optimization algorithms use a linear number of messages in the number of agents, but of exponential size. This is often the main limitation for their pr...
Ismel Brito, Pedro Meseguer
NIPS
2001
14 years 11 months ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...