We provide an analytical comparison between discounted and average reward temporal-difference (TD) learning with linearly parameterized approximations. We first consider the asympt...
This paper proposes a high-level Reinforcement Learning (RL) control system for solving the action selection problem of an autonomous robot. Although the dominant approach, whe...
We consider a model of learning Boolean functions from quantum membership queries. This model was studied in [26], where it was shown that any class of Boolean functions which is i...
We consider a resource selection game with incomplete information about the resource-cost functions. All the players know is the set of players, an upper bound on the possible cos...
In this paper we investigate the relation between transfer learning in reinforcement learning with function approximation and supervised learning with concept drift. We present a n...