Sciweavers

69 search results - page 2 / 14
» PAC-Bayesian Policy Evaluation for Reinforcement Learning
JAIR
2002
13 years 5 months ago
Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System
Designing the dialogue policy of a spoken dialogue system involves many nontrivial choices. This paper presents a reinforcement learning approach for automatically optimizing a di...
Satinder P. Singh, Diane J. Litman, Michael J. Kea...
GECCO
2009
Springer
13 years 3 months ago
Uncertainty handling CMA-ES for reinforcement learning
The covariance matrix adaptation evolution strategy (CMA-ES) has proven to be a powerful method for reinforcement learning (RL). Recently, the CMA-ES has been augmented with an ada...
Verena Heidrich-Meisner, Christian Igel
ATAL
2009
Springer
14 years 19 hours ago
SarsaLandmark: an algorithm for learning in POMDPs with landmarks
Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in pa...
Michael R. James, Satinder P. Singh
SIGDIAL
2010
13 years 3 months ago
Adaptive Referring Expression Generation in Spoken Dialogue Systems: Evaluation with Real Users
We present new results from a real-user evaluation of a data-driven approach to learning user-adaptive referring expression generation (REG) policies for spoken dialogue systems. ...
Srinivasan Janarthanam, Oliver Lemon
ICML
2001
IEEE
14 years 6 months ago
Off-Policy Temporal Difference Learning with Function Approximation
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Doina Precup, Richard S. Sutton, Sanjoy Dasgupta