Policy gradient (PG) reinforcement learning algorithms have strong (local) convergence guarantees, but their learning performance is typically limited by a large variance in the e...
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
Using a distributed algorithm rather than a centralized one can be extremely beneficial in large search problems. In addition, the incorporation of machine learning techniques lik...
Reinforcement techniques have been successfully used to maximise the expected cumulative reward of statistical dialogue systems. Typically, reinforcement learning is used to estim...
This paper proposes a mechanism of noise tolerance for reinforcement learning algorithms. An adaptive agent that employs reinforcement learning algorithms may receive and accumula...
Richardson Ribeiro, Alessandro L. Koerich, Fabr&ia...