Sciweavers

17 search results - page 2 / 4
» Gradient-Based Relational Reinforcement Learning of Temporal...
ATAL
2009
Springer
14 years 4 days ago
An empirical analysis of value function-based and policy search reinforcement learning
In several agent-oriented scenarios in the real world, an autonomous agent that is situated in an unknown environment must learn through a process of trial and error to take actio...
Shivaram Kalyanakrishnan, Peter Stone
ICML
2000
IEEE
14 years 6 months ago
Eligibility Traces for Off-Policy Policy Evaluation
Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference meth...
Doina Precup, Richard S. Sutton, Satinder P. Singh
IJCAI
2007
13 years 7 months ago
Utile Distinctions for Relational Reinforcement Learning
We introduce an approach to autonomously creating state space abstractions for an online reinforcement learning agent using a relational representation. Our approach uses a tree-b...
William Dabney, Amy McGovern
NIPS
2001
13 years 7 months ago
Model-Free Least-Squares Policy Iteration
We propose a new approach to reinforcement learning which combines least squares function approximation with policy iteration. Our method is model-free and completely off policy. ...
Michail G. Lagoudakis, Ronald Parr
ATAL
2005
Springer
13 years 11 months ago
Behavior transfer for value-function-based reinforcement learning
Temporal difference (TD) learning methods [22] have become popular reinforcement learning techniques in recent years. TD methods have had some experimental successes and have been...
Matthew E. Taylor, Peter Stone