The creation of Answer Communities around a FAQs Site is proposed to speed up the process of answering questions. Our approach combines long-term and short-term rewards. Long-term ...
We propose an active vision system for object acquisition. The core of our approach is a reinforcement learning module which learns a strategy to scan an object. The agent moves a...
Gabriele Peters, Claus-Peter Alberts, Markus Bries...
In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...
Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...
We consider a variant of the classic multi-armed bandit problem (MAB), which we call FEEDBACK MAB, where the reward obtained by playing each of n independent arms varies according...