This paper describes an investigation into the refinement of context-based human behavior models through the use of experiential learning. Specifically, a tactical agent was endow...
In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
We consider the task of learning to accurately follow a trajectory in a vehicle such as a car or helicopter. A number of dynamic programming algorithms such as Differential Dynami...
J. Zico Kolter, Adam Coates, Andrew Y. Ng, Yi Gu, ...
In this paper we present a general, flexible framework for learning mappings from images to actions by interacting with the environment. The basic idea is to introduce a feature-...
In this paper, we investigate the hypothesis that plan recognition can significantly improve the performance of a casebased reinforcement learner in an adversarial action selectio...