This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(λ), LSTD(λ)...
Given a set of n points S in the plane, a triangulation of S is a subdivision of the convex hull into triangles whose vertices are from S. In the kinetic setting, the input point ...
This paper focuses on the assignment of discrete points among K robots and determining the order in which the points should be processed by the robots, in the presence of geometric...
Nilanjan Chakraborty, Srinivas Akella, John T. Wen
In the multi-armed bandit (MAB) problem, there are k distributions associated with the rewards of playing each of k strategies (slot machine arms). The reward distributions are ini...
Bayesian reinforcement learning (RL) is aimed at making more efficient use of data samples, but typically uses significantly more computation. For discrete Markov Decis...