Partially observable Markov decision processes (POMDPs) are widely used for planning under uncertainty. In many applications, the huge size of the POMDP state space makes straightf...
Joni Pajarinen, Jaakko Peltonen, Ari Hottinen, Mik...
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
Robust tracking of abrupt motion is a challenging task
in computer vision due to the large motion uncertainty. In
this paper, we propose a stochastic approximation Monte
Carlo (...
Virtually all methods of learning dynamic systems from data start from the same basic assumption: the learning algorithm will be given a sequence of data generated from the dynami...
We present a speech pre-processing scheme (SPPS) for robust speech recognition in the moving motorcycle environment. The SPPS is dynamically adapted during the run-time operation ...