Planning problems are often formulated as heuristic search. The choice of the heuristic function plays a significant role in the performance of planning systems, but a good heuris...
Approximate linear programming (ALP) offers a promising framework for solving large factored Markov decision processes (MDPs) with both discrete and continuous states. Successful ...
Policy teaching considers a Markov Decision Process setting in which an interested party aims to influence an agent’s decisions by providing limited incentives. In this paper, ...
Abstract— Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers...
Appropriately designing sampling policies is highly important for obtaining better control policies in reinforcement learning. In this paper, we first show that the least-squares ...