Search Sciweavers | Sciweavers

52 search results - page 7 / 11

» Error Bounds for Approximate Policy Iteration

ICML
2004
IEEE

Approximate inference by Markov chains on union spaces

16 years 6 months ago

Download www.ics.uci.edu

A standard method for approximating averages in probabilistic models is to construct a Markov chain in the product space of the random variables with the desired equilibrium distr...

Max Welling, Michal Rosen-Zvi, Yee Whye Teh

UAI
2008

CORL: A Continuous-state Offset-dynamics Reinforcement Learner

15 years 7 months ago

Download uai2008.cs.helsinki.fi

Continuous state spaces and stochastic, switching dynamics characterize a number of rich, real-world domains, such as robot navigation across varying terrain. We describe a reinfor...

Emma Brunskill, Bethany R. Leffler, Lihong Li, Mic...

SIAMIS
2011

Gradient-Based Methods for Sparse Recovery

15 years 24 days ago

Download www.math.ufl.edu

The convergence rate is analyzed for the sparse reconstruction by separable approximation (SpaRSA) algorithm for minimizing a sum f(x) + ψ(x), where f is smooth and ψ is convex, ...

William W. Hager, Dzung T. Phan, Hongchao Zhang

FUIN
2010

Cluster Tree Elimination for Distributed Constraint Optimization with Quality Guarantees

15 years 3 months ago

Download www.iiia.csic.es

Some distributed constraint optimization algorithms use a linear number of messages in the number of agents, but of exponential size. This is often the main limitation for their pr...

Ismel Brito, Pedro Meseguer

NIPS
2001

Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

15 years 7 months ago

Download jmlr.csail.mit.edu

Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...

Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
