Sciweavers

5 search results - page 1 / 1
» Expected Mistake Bound Model for On-Line Reinforcement Learn...
EWRL
2008
13 years 5 months ago
Efficient Reinforcement Learning in Parameterized Models: Discrete Parameter Case
We consider reinforcement learning in the parameterized setup, where the model is known to belong to a parameterized family of Markov Decision Processes (MDPs). We further impose ...
Kirill Dyagilev, Shie Mannor, Nahum Shimkin
CORR
2000
Springer
13 years 3 months ago
Predicting the expected behavior of agents that learn about agents: the CLRI framework
We describe a framework and equations used to model and predict the behavior of multi-agent systems (MASs) with learning agents. A difference equation is used for calculating the ...
José M. Vidal, Edmund H. Durfee
ML
2008
ACM
13 years 3 months ago
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
We consider batch reinforcement learning problems in continuous space, expected total discounted-reward Markovian Decision Problems. As opposed to previous theoretical wo...
András Antos, Csaba Szepesvári, Ré...
ICML
2009
IEEE
14 years 4 months ago
Near-Bayesian exploration in polynomial time
We consider the exploration/exploitation problem in reinforcement learning (RL). The Bayesian approach to model-based RL offers an elegant solution to this problem, by considering...
J. Zico Kolter, Andrew Y. Ng