Sciweavers

» Ensemble Algorithms in Reinforcement Learning
ECML
2007
Springer
15 years 10 months ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber
CEEMAS
2005
Springer
15 years 9 months ago
A Direct Reputation Model for VO Formation
We show that reputation is a basic ingredient in the Virtual Organisation (VO) formation process. Agents can use their experiences gained in direct past interactions to model other...
Arturo Avila-Rosas, Michael Luck
ICRA
1994
IEEE
15 years 8 months ago
Harmonic Functions and Collision Probabilities
There is a close relationship between harmonic functions (which have recently been proposed for path planning) and hitting probabilities for random processes. The hitting probab...
Christopher I. Connolly
ESANN
2008
15 years 5 months ago
Improvement in Game Agent Control Using State-Action Value Scaling
The aim of this paper is to enhance the performance of a reinforcement learning game agent controller, within a dynamic game environment, through the retention of learned informati...
Leo Galway, Darryl Charles, Michaela M. Black
ESANN
2004
15 years 5 months ago
High-accuracy value-function approximation with neural networks applied to the acrobot
Several reinforcement-learning techniques have already been applied to the Acrobot control problem, using linear function approximators to estimate the value function. In this pape...
Rémi Coulom