Sciweavers

30 search results - page 5 / 6
» Model-Based Average Reward Reinforcement Learning
NIPS
2001
15 years 2 months ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
NN
2002
Springer
15 years 24 days ago
Opponent interactions between serotonin and dopamine
Anatomical and pharmacological evidence suggests that the dorsal raphe serotonin system and the ventral tegmental and substantia nigra dopamine system may act as mutual opponents....
Nathaniel D. Daw, Sham Kakade, Peter Dayan
ECAL
2007
Springer
15 years 7 months ago
Guided Self-organisation for Autonomous Robot Development
The paper presents a method to guide the self-organised development of behaviours of autonomous robots. In earlier publications we demonstrated how to use the homeokinesi...
Georg Martius, J. Michael Herrmann, Ralf Der
ACL
2008
15 years 2 months ago
Learning Effective Multimodal Dialogue Strategies from Wizard-of-Oz Data: Bootstrapping and Evaluation
We address two problems in the field of automatic optimization of dialogue strategies: learning effective dialogue strategies when no initial data or system exists, and evaluating...
Verena Rieser, Oliver Lemon
BROADNETS
2004
IEEE
15 years 4 months ago
Efficient QoS Provisioning for Adaptive Multimedia in Mobile Communication Networks by Reinforcement Learning
The scarcity and large fluctuations of link bandwidth in wireless networks have motivated the development of adaptive multimedia services in mobile communication networks, where i...
Fei Yu, Vincent W. S. Wong, Victor C. M. Leung