We study decision making in environments where the reward is only partially observed, but can be modeled as a function of an action and an observed context. This setting, known as...
Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference meth...
Doina Precup, Richard S. Sutton, Satinder P. Singh
Direct policy search is a practical way to solve reinforcement learning problems involving continuous state and action spaces. The goal becomes finding policy parameters that maxi...
The covariance matrix adaptation evolution strategy (CMAES) has proven to be a powerful method for reinforcement learning (RL). Recently, the CMA-ES has been augmented with an ada...
— Reinforcement learning (RL) is one of the most general approaches to learning control. Its applicability to complex motor systems, however, has been largely impossible so far d...