Sciweavers

161 search results - page 26 / 33
» Convergence Problems of General-Sum Multiagent Reinforcement...
ICML
2010
IEEE
14 years 7 months ago
Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda
Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease of implementation and use of "bootstrapped" return estimates to make effi...
Carlton Downey, Scott Sanner
ICML
2001
IEEE
15 years 10 months ago
Off-Policy Temporal Difference Learning with Function Approximation
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Doina Precup, Richard S. Sutton, Sanjoy Dasgupta
ATAL
2008
Springer
14 years 11 months ago
Sequential decision making with untrustworthy service providers
In this paper, we deal with the sequential decision making problem of agents operating in computational economies, where there is uncertainty regarding the trustworthiness of serv...
W. T. Luke Teacy, Georgios Chalkiadakis, Alex Roge...
AAAI
2010
14 years 11 months ago
Relative Entropy Policy Search
Policy search is a successful approach to reinforcement learning. However, policy improvements often result in the loss of information. Hence, it has been marred by premature conv...
Jan Peters, Katharina Mülling, Yasemin Altun
ATAL
2009
Springer
15 years 4 months ago
Adaptive learning in evolving task allocation networks
In this paper, we study multi-agent economic systems using a recent approach to economic modeling called Agent-based Computational Economics (ACE): the application of the Complex ...
Tomas Klos, Bart Nooteboom