We present an improvement of Noviko 's perceptron convergence theorem. Reinterpreting this mistakebound as a margindependent sparsity guarantee allows us to give a PAC{style ...
Thore Graepel, Ralf Herbrich, Robert C. Williamson
Abstract. We consider Reinforcement Learning for average reward zerosum stochastic games. We present and analyze two algorithms. The first is based on relative Q-learning and the ...
In this paper we obtain convergence bounds for the concentration of Bayesian posterior distributions (around the true distribution) using a novel method that simplifies and enhan...
Several multiagent reinforcement learning (MARL) algorithms have been proposed to optimize agents' decisions. Due to the complexity of the problem, the majority of the previo...
A number of reinforcement learning algorithms have been developed that are guaranteed to converge to the optimal solution when used with lookup tables. It is shown, however, that ...