Search Sciweavers | Sciweavers

210 search results - page 16 / 42

» An analysis of reinforcement learning with function approxim...

142

click to vote

NIPS
2001

144views Information Technology» more NIPS 2001»

Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

15 years 6 months ago

Download jmlr.csail.mit.edu

Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...

Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...

claim paper

Read More »

152

click to vote

CDC
2010
IEEE

160views Control Systems» more CDC 2010»

Adaptive bases for Q-learning

15 years 6 days ago

Download webee.technion.ac.il

Abstract-- We consider reinforcement learning, and in particular, the Q-learning algorithm in large state and action spaces. In order to cope with the size of the spaces, a functio...

Dotan Di Castro, Shie Mannor

claim paper

Read More »

133

click to vote

ICML
2007
IEEE

204views Machine Learning» more ICML 2007»

Constructing basis functions from directed graphs for value function approximation

16 years 6 months ago

Download www.machinelearning.org

Basis functions derived from an undirected graph connecting nearby samples from a Markov decision process (MDP) have proven useful for approximating value functions. The success o...

Jeffrey Johns, Sridhar Mahadevan

claim paper

Read More »

179

click to vote

UAI
2008

242views Artificial Intelligence» more UAI 2008»

Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping

15 years 6 months ago

Download uai2008.cs.helsinki.fi

We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...

Richard S. Sutton, Csaba Szepesvári, Alborz...

claim paper

Read More »

123

Voted

ATAL
2009
Springer

198views Intelligent Agents» more ATAL 2009»

SarsaLandmark: an algorithm for learning in POMDPs with landmarks

15 years 12 months ago

Download www.aamas-conference.org

Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in pa...

Michael R. James, Satinder P. Singh

claim paper

Read More »

« Prev « First page 16 / 42 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers