Sciweavers

4 search results - page 1 / 1
ICML
2010
IEEE
13 years 2 months ago
Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda
Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease of implementation and use of "bootstrapped" return estimates to make effi...
Carlton Downey, Scott Sanner
ICML
2007
IEEE
14 years 5 months ago
Bayesian actor-critic algorithms
We present a new actor-critic learning model in which a Bayesian class of non-parametric critics, using Gaussian process temporal difference learning, is used. Such critics model ...
Mohammad Ghavamzadeh, Yaakov Engel
CORR
2010
Springer
13 years 4 months ago
Large scale probabilistic available bandwidth estimation
The common utilization-based definition of available bandwidth and many of the existing tools to estimate it suffer from several important weaknesses: i) most tools report a point...
Frederic Thouin, Mark Coates, Michael G. Rabbat
ATAL
2010
Springer
13 years 6 months ago
Planning against fictitious players in repeated normal form games
Planning how to interact against bounded-memory and unbounded-memory learning opponents needs different treatment. Thus far, however, work in this area has shown how to design pla...
Enrique Munoz de Cote, Nicholas R. Jennings