RVRL (Rule Value Reinforcement Learning) is a new algorithm which extends an existing learning framework that models the environment of a situated agent using a probabilistic rule...
Reinforcement learning deals with learning optimal or near optimal policies while interacting with the environment. Application domains with many continuous variables are difficul...
— In this paper, we present an approach that applies the reinforcement learning principle to the problem of learning height control policies for aerial blimps. In contrast to pre...
Axel Rottmann, Christian Plagemann, Peter Hilgers,...
Contextual bandit learning is a reinforcement learning problem where the learner repeatedly receives a set of features (context), takes an action and receives a reward based on th...
We consider reinforcement learning in the parameterized setup, where the model is known to belong to a parameterized family of Markov Decision Processes (MDPs). We further impose ...