Sciweavers

162 search results - page 18 / 33
» Off-Policy Temporal Difference Learning with Function Approx...
IJCAI
2007
14 years 11 months ago
Reinforcement Learning of Local Shape in the Game of Go
We explore an application to the game of Go of a reinforcement learning approach based on a linear evaluation function and large numbers of binary features. This strategy has prov...
David Silver, Richard S. Sutton, Martin Mülle...
ICMLA
2007
14 years 11 months ago
Control of a re-entrant line manufacturing model with a reinforcement learning approach
This paper presents the application of a reinforcement learning (RL) approach for the near-optimal control of a re-entrant line manufacturing (RLM) model. The RL approach utilizes...
José A. Ramírez-Hernández, Em...
ATAL
2009
Springer
15 years 4 months ago
An empirical analysis of value function-based and policy search reinforcement learning
In several agent-oriented scenarios in the real world, an autonomous agent that is situated in an unknown environment must learn through a process of trial and error to take actio...
Shivaram Kalyanakrishnan, Peter Stone
ICML
2007
IEEE
15 years 10 months ago
Bayesian actor-critic algorithms
We present a new actor-critic learning model in which a Bayesian class of non-parametric critics, using Gaussian process temporal difference learning, is used. Such critics model ...
Mohammad Ghavamzadeh, Yaakov Engel
ML
2007
ACM
14 years 9 months ago
Surrogate maximization/minimization algorithms and extensions
Surrogate maximization (or minimization) (SM) algorithms are a family of algorithms that can be regarded as a generalization of expectation-maximization (EM) algorithms. A...
Zhihua Zhang, James T. Kwok, Dit-Yan Yeung