We discuss vision as a sensory modality for systems that effect actions in response to perceptions. While the internal representations informed by vision may be arbitrarily compl...
Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...
We propose a new framework for aiding a reinforcement learner by allowing it to relocate, or move, to a state it selects so as to decrease the number of steps it needs to take in ...
Abstract. This paper proposes a novel approach to discover options in the form of conditionally terminating sequences, and shows how they can be integrated into reinforcement learn...
Gaussian Process Temporal Difference (GPTD) learning offers a Bayesian solution to the policy evaluation problem of reinforcement learning. In this paper we extend the GPTD framew...