Search Sciweavers | Sciweavers

86 search results - page 14 / 18

» Evolution of reward functions for reinforcement learning

121

click to vote

GECCO
2009
Springer

124views Optimization» more GECCO 2009»

Reinforcement learning for games: failures and successes

15 years 6 months ago

Download www.gm.fh-koeln.de

We apply CMA-ES, an evolution strategy with covariance matrix adaptation, and TDL (Temporal Difference Learning) to reinforcement learning tasks. In both cases these algorithms se...

Wolfgang Konen, Thomas Bartz-Beielstein

claim paper

Read More »

108

click to vote

ML
1998
ACM

136views Machine Learning» more ML 1998»

Co-Evolution in the Successful Learning of Backgammon Strategy

15 years 1 months ago

Download www.demo.cs.brandeis.edu

Following Tesauro’s work on TD-Gammon, we used a 4000 parameter feed-forward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of t...

Jordan B. Pollack, Alan D. Blair

claim paper

Read More »

105

click to vote

ATAL
2008
Springer

123views Intelligent Agents» more ATAL 2008»

Sigma point policy iteration

15 years 3 months ago

Download web.mit.edu

In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...

Michael H. Bowling, Alborz Geramifard, David Winga...

claim paper

Read More »

136

click to vote

ICML
2000
IEEE

153views Machine Learning» more ICML 2000»

Eligibility Traces for Off-Policy Policy Evaluation

16 years 2 months ago

Download www.cs.ualberta.ca

Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference meth...

Doina Precup, Richard S. Sutton, Satinder P. Singh

claim paper

Read More »

128

click to vote

AGENTS
1999
Springer

126views Security Privacy» more AGENTS 1999»

General Principles of Learning-Based Multi-Agent Systems

15 years 5 months ago

Download web.engr.oregonstate.edu

We consider the problem of how to design large decentralized multiagent systems (MAS’s) in an automated fashion, with little or no hand-tuning. Our approach has each agent run a...

David Wolpert, Kevin R. Wheeler, Kagan Tumer

claim paper

Read More »

« Prev « First page 14 / 18 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers