Sciweavers

» Learning to Take Actions
ML
2002
ACM
Machine Learning
14 years 9 months ago
Near-Optimal Reinforcement Learning in Polynomial Time
We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decisio...
Michael J. Kearns, Satinder P. Singh
NIPS
2000
14 years 11 months ago
Using Free Energies to Represent Q-values in a Multiagent Reinforcement Learning Task
The problem of reinforcement learning in large factored Markov decision processes is explored. The Q-value of a state-action pair is approximated by the free energy of a product o...
Brian Sallans, Geoffrey E. Hinton
NN
2002
Springer
Neural Networks
14 years 9 months ago
Control of exploitation-exploration meta-parameter in reinforcement learning
In reinforcement learning (RL), the duality between exploitation and exploration has long been an important issue. This paper presents a new method that controls the balance betwe...
Shin Ishii, Wako Yoshida, Junichiro Yoshimoto
EWRL
2008
14 years 11 months ago
Optimistic Planning of Deterministic Systems
If one possesses a model of a controlled deterministic system, then from any state, one may consider the set of all possible reachable states starting from that state and using any...
Jean-François Hren, Rémi Munos
COLT
2010
Springer
14 years 8 months ago
Open Loop Optimistic Planning
We consider the problem of planning in a stochastic and discounted environment with a limited numerical budget. More precisely, we investigate strategies exploring the set of poss...
Sébastien Bubeck, Rémi Munos