Sciweavers

779 search results - page 78 / 156
» Reinforcement Using Supervised Learning for Policy Generaliz...
ICML
2003
IEEE
16 years 3 months ago
Exploration in Metric State Spaces
We present metric-E³, a provably near-optimal algorithm for reinforcement learning in Markov decision processes in which there is a natural metric on the state space that allows t...
Sham Kakade, Michael J. Kearns, John Langford
ICML
2009
IEEE
16 years 3 months ago
A majorization-minimization algorithm for (multiple) hyperparameter learning
We present a general Bayesian framework for hyperparameter tuning in L2-regularized supervised learning models. Paradoxically, our algorithm works by first analytically integratin...
Chuan-Sheng Foo, Chuong B. Do, Andrew Y. Ng
COLT
2010
Springer
15 years 17 days ago
An Asymptotically Optimal Bandit Algorithm for Bounded Support Models
The multiarmed bandit problem is a typical example of the dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
Junya Honda, Akimichi Takemura
IJCAI
2007
15 years 4 months ago
Transfer Learning in Real-Time Strategy Games Using Hybrid CBR/RL
The goal of transfer learning is to use the knowledge acquired in a set of source tasks to improve performance in a related but previously unseen target task. In this paper, we pr...
Manu Sharma, Michael P. Holmes, Juan Carlos Santam...
EPIA
2003
Springer
15 years 7 months ago
Adaptation to Drifting Concepts
Most supervised learning algorithms assume the stability of the target concept over time. Nevertheless, in many real-user modeling systems, where the data is collected over an ex...
Gladys Castillo, João Gama, Pedro Medas