Sciweavers

2108 search results - page 58 / 422
» Tracking in Reinforcement Learning
Sort
View
NECO
2010
97views more  NECO 2010»
14 years 12 months ago
Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning
Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...
Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto...
AGI
2011
14 years 5 months ago
Reinforcement Learning and the Bayesian Control Rule
We present an actor-critic scheme for reinforcement learning in complex domains. The main contribution is to show that planning and I/O dynamics can be separated such that an intra...
Pedro Alejandro Ortega, Daniel Alexander Braun, Si...
ICML
2007
IEEE
16 years 2 months ago
Multi-task reinforcement learning: a hierarchical Bayesian approach
We consider the problem of multi-task reinforcement learning, where the agent needs to solve a sequence of Markov Decision Processes (MDPs) chosen randomly from a fixed but unknow...
Aaron Wilson, Alan Fern, Soumya Ray, Prasad Tadepa...
AR
2007
105views more  AR 2007»
15 years 1 months ago
Reinforcement learning of a continuous motor sequence with hidden states
—Reinforcement learning is the scheme for unsupervised learning in which robots are expected to acquire behavior skills through self-explorations based on reward signals. There a...
Hiroaki Arie, Tetsuya Ogata, Jun Tani, Shigeki Sug...
JAIR
2000
131views more  JAIR 2000»
15 years 1 months ago
An Application of Reinforcement Learning to Dialogue Strategy Selection in a Spoken Dialogue System for Email
This paper describes a novel method by which a spoken dialogue system can learn to choose an optimal dialogue strategy from its experience interacting with human users. The method...
Marilyn A. Walker