Search Sciweavers | Sciweavers

38 search results - page 2 / 8

» The utility of temporal abstraction in reinforcement learnin...

CIG
2005
IEEE

120 views, Applied Computing

Adapting Reinforcement Learning for Computer Games: Using Group Utility Functions

15 years 5 months ago

Download cswww.essex.ac.uk

Abstract: Group utility functions are an extension of the common team utility function for providing multiple agents with a common reinforcement learning signal for learning cooperat...

Jay Bradley, Gillian Hayes

NIPS
2008

173 views, Information Technology

On the asymptotic equivalence between differential Hebbian and temporal difference learning using a local third factor

15 years 1 month ago

Download books.nips.cc

In this theoretical contribution we provide mathematical proof that two of the most important classes of network learning - correlation-based differential Hebbian learning and rew...

Christoph Kolodziejski, Bernd Porr, Minija Tamosiu...

ICML
1998
IEEE

165 views, Machine Learning

Intra-Option Learning about Temporally Abstract Actions

16 years 13 days ago

Download www.cs.ualberta.ca

Intra-Option Learning about Temporally Abstract Actions. Richard S. Sutton, Department of Computer Science, University of Massachusetts, Amherst, MA 01003-4610, rich@cs.umass.edu. Doina Precup, D...

Richard S. Sutton, Doina Precup, Satinder P. Singh

ICML
2002
IEEE

155 views, Machine Learning

Discovering Hierarchy in Reinforcement Learning with HEXQ

16 years 13 days ago

Download www.cs.berkeley.edu

An open problem in reinforcement learning is discovering hierarchical structure. HEXQ, an algorithm which automatically attempts to decompose and solve a model-free factored MDP h...

Bernhard Hengst

ICML
2001
IEEE

185 views, Machine Learning

Off-Policy Temporal Difference Learning with Function Approximation

16 years 13 days ago

Download www.cs.ualberta.ca

We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...

Doina Precup, Richard S. Sutton, Sanjoy Dasgupta
