Search Sciweavers | Sciweavers

» Reinforcement Learning Based on On-Line EM Algorithm

GECCO
2006
Springer

Comparing evolutionary and temporal difference methods in a reinforcement learning domain

Both genetic algorithms (GAs) and temporal difference (TD) methods have proven effective at solving reinforcement learning (RL) problems. However, since few rigorous empirical com...

Matthew E. Taylor, Shimon Whiteson, Peter Stone

ICML
2005
IEEE

Reinforcement learning with Gaussian processes

Gaussian Process Temporal Difference (GPTD) learning offers a Bayesian solution to the policy evaluation problem of reinforcement learning. In this paper we extend the GPTD framew...

Yaakov Engel, Shie Mannor, Ron Meir

NIPS
1993

Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach

This paper describes the Q-routing algorithm for packet routing, in which a reinforcement learning module is embedded into each node of a switching network. Only local communicati...

Justin A. Boyan, Michael L. Littman

NIPS
2000

Using Free Energies to Represent Q-values in a Multiagent Reinforcement Learning Task

The problem of reinforcement learning in large factored Markov decision processes is explored. The Q-value of a state-action pair is approximated by the free energy of a product o...

Brian Sallans, Geoffrey E. Hinton

PRICAI
2000
Springer

Generating Hierarchical Structure in Reinforcement Learning from State Variables

This paper presents the CQ algorithm which decomposes and solves a Markov Decision Process (MDP) by automatically generating a hierarchy of smaller MDPs using state variables. The ...

Bernhard Hengst
