Sciweavers

» Generalization in Reinforcement Learning: Safely Approximati...
ATAL
2008
Springer
15 years 1 month ago
Sigma point policy iteration
In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
ECAI
2008
Springer
15 years 1 month ago
Reinforcement Learning with the Use of Costly Features
In many practical reinforcement learning problems, the state space is too large to permit an exact representation of the value function, much less the time required to compute it. ...
Robby Goetschalckx, Scott Sanner, Kurt Driessens
ICML
2005
IEEE
16 years 12 days ago
Proto-value functions: developmental reinforcement learning
This paper presents a novel framework called proto-reinforcement learning (PRL), based on a mathematical model of a proto-value function: these are task-independent basis function...
Sridhar Mahadevan
ICML
2003
IEEE
15 years 4 months ago
The Significance of Temporal-Difference Learning in Self-Play Training TD-Rummy versus EVO-rummy
Reinforcement learning has been used for training game playing agents. The value function for a complex game must be approximated with a continuous function because the number of ...
Clifford Kotnik, Jugal K. Kalita
IJCAI
2007
15 years 1 month ago
General Game Learning Using Knowledge Transfer
We present a reinforcement learning game player that can interact with a General Game Playing system and transfer knowledge learned in one game to expedite learning in many other ...
Bikramjit Banerjee, Peter Stone