Several researchers have shown that the efficiency of value iteration, a dynamic programming algorithm for Markov decision processes, can be improved by prioritizing the order of...
Reinforcement learning (RL) was originally proposed as a framework to allow agents to learn in an online fashion as they interact with their environment. Existing RL algorithms co...
Pascal Poupart, Nikos A. Vlassis, Jesse Hoey, Kevi...
We address the problem of automatically constructing basis functions for linear approximation of the value function of a Markov Decision Process (MDP). Our work builds on results ...
We study an approach to policy selection for large relational Markov Decision Processes (MDPs). We consider a variant of approximate policy iteration (API) that replaces the usual...
Abstract--The difficulties encountered in sequential decisionmaking problems under uncertainty are often linked to the large size of the state space. Exploiting the structure of th...