Search Sciweavers | Sciweavers

813 search results - page 132 / 163

» Ensemble Algorithms in Reinforcement Learning

ECML
2007
Springer

Machine Learning

Policy Gradient Critics

15 years 10 months ago

Download www.idsia.ch

We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...

Daan Wierstra, Jürgen Schmidhuber

CEEMAS
2005
Springer

Intelligent Agents

A Direct Reputation Model for VO Formation

15 years 9 months ago

Download www.dcs.kcl.ac.uk

We show that reputation is a basic ingredient in the Virtual Organisation (VO) formation process. Agents can use their experiences gained in direct past interactions to model other...

Arturo Avila-Rosas, Michael Luck

ICRA
1994
IEEE

Robotics

Harmonic Functions and Collision Probabilities

15 years 8 months ago

Download www.cs.cmu.edu

There is a close relationship between harmonic functions (which have recently been proposed for path planning) and hitting probabilities for random processes. The hitting probab...

Christopher I. Connolly

ESANN
2008

Neural Networks

Improvement in Game Agent Control Using State-Action Value Scaling

15 years 5 months ago

Download www.dice.ucl.ac.be

The aim of this paper is to enhance the performance of a reinforcement learning game agent controller, within a dynamic game environment, through the retention of learned informati...

Leo Galway, Darryl Charles, Michaela M. Black

ESANN
2004

Neural Networks

High-Accuracy Value-Function Approximation with Neural Networks Applied to the Acrobot

15 years 5 months ago

Download remi.coulom.free.fr

Several reinforcement-learning techniques have already been applied to the Acrobot control problem, using linear function approximators to estimate the value function. In this pape...

Rémi Coulom
