Search Sciweavers | Sciweavers

38 search results - page 2 / 8

» The utility of temporal abstraction in reinforcement learnin...

click to vote

CIG
2005
IEEE

120views Applied Computing» more CIG 2005»

Adapting Reinforcement Learning for Computer Games: Using Group Utility Functions

13 years 10 months ago

Download cswww.essex.ac.uk

AbstractGroup utility functions are an extension of the common team utility function for providing multiple agents with a common reinforcement learning signal for learning cooperat...

Jay Bradley, Gillian Hayes

claim paper

Read More »

click to vote

NIPS
2008

173views Information Technology» more NIPS 2008»

On the asymptotic equivalence between differential Hebbian and temporal difference learning using a local third factor

13 years 6 months ago

Download books.nips.cc

In this theoretical contribution we provide mathematical proof that two of the most important classes of network learning - correlation-based differential Hebbian learning and rew...

Christoph Kolodziejski, Bernd Porr, Minija Tamosiu...

claim paper

Read More »

click to vote

ICML
1998
IEEE

165views Machine Learning» more ICML 1998»

Intra-Option Learning about Temporally Abstract Actions

14 years 5 months ago

Download www.cs.ualberta.ca

tion Learning about Temporally Abstract Actions Richard S. Sutton Department of Computer Science University of Massachusetts Amherst, MA 01003-4610 rich@cs.umass.edu Doina Precup D...

Richard S. Sutton, Doina Precup, Satinder P. Singh

claim paper

Read More »

click to vote

ICML
2002
IEEE

155views Machine Learning» more ICML 2002»

Discovering Hierarchy in Reinforcement Learning with HEXQ

14 years 5 months ago

Download www.cs.berkeley.edu

An open problem in reinforcement learning is discovering hierarchical structure. HEXQ, an algorithm which automatically attempts to decompose and solve a model-free factored MDP h...

Bernhard Hengst

claim paper

Read More »

click to vote

ICML
2001
IEEE

185views Machine Learning» more ICML 2001»

Off-Policy Temporal Difference Learning with Function Approximation

14 years 5 months ago

Download www.cs.ualberta.ca

We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...

Doina Precup, Richard S. Sutton, Sanjoy Dasgupta

claim paper

Read More »

« Prev « First page 2 / 8 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers