ICML
1997
IEEE

Exponentiated Gradient Methods for Reinforcement Learning

16 years 8 months ago

Download www.cs.ualberta.ca

Doina Precup, Richard S. Sutton

IJCAI
2001

Exploiting Multiple Secondary Reinforcers in Policy Gradient Reinforcement Learning

15 years 9 months ago

Download www.cs.colorado.edu

Most formulations of Reinforcement Learning depend on a single reinforcement reward value to guide the search for the optimal policy solution. If observation of this reward is rar...

Gregory Z. Grudic, Lyle H. Ungar

IJCAI
2003

Covariant Policy Search

15 years 9 months ago

Download www.ri.cmu.edu

We investigate the problem of non-covariant behavior of policy gradient reinforcement learning algorithms. The policy gradient approach is amenable to analysis by information geom...

J. Andrew Bagnell, Jeff G. Schneider

ICML
2007
IEEE

Exponentiated gradient algorithms for log-linear structured prediction

16 years 8 months ago

Download www.machinelearning.org

Conditional log-linear models are a commonly used method for structured prediction. Efficient learning of parameters in these models is therefore an important problem. This paper ...

Amir Globerson, Terry Koo, Xavier Carreras, Michae...

NIPS
2001

Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

15 years 9 months ago

Download jmlr.csail.mit.edu

Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...

Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
