Sciweavers

30 search results - page 5 / 6
» Model-Based Average Reward Reinforcement Learning
NIPS
2001
13 years 6 months ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
NN
2002
Springer
13 years 5 months ago
Opponent interactions between serotonin and dopamine
Anatomical and pharmacological evidence suggests that the dorsal raphe serotonin system and the ventral tegmental and substantia nigra dopamine system may act as mutual opponents....
Nathaniel D. Daw, Sham Kakade, Peter Dayan
ECAL
2007
Springer
13 years 11 months ago
Guided Self-organisation for Autonomous Robot Development
The paper presents a method to guide the self-organised development of behaviours of autonomous robots. In earlier publications we demonstrated how to use the homeokinesi...
Georg Martius, J. Michael Herrmann, Ralf Der
ACL
2008
13 years 7 months ago
Learning Effective Multimodal Dialogue Strategies from Wizard-of-Oz Data: Bootstrapping and Evaluation
We address two problems in the field of automatic optimization of dialogue strategies: learning effective dialogue strategies when no initial data or system exists, and evaluating...
Verena Rieser, Oliver Lemon
BROADNETS
2004
IEEE
13 years 9 months ago
Efficient QoS Provisioning for Adaptive Multimedia in Mobile Communication Networks by Reinforcement Learning
The scarcity and large fluctuations of link bandwidth in wireless networks have motivated the development of adaptive multimedia services in mobile communication networks, where i...
Fei Yu, Vincent W. S. Wong, Victor C. M. Leung