Although several researchers have integrated methods for reinforcement learning (RL) with case-based reasoning (CBR) to model continuous action spaces, existing integrations typic...
We consider the task of reinforcement learning with linear value function approximation. Temporal difference algorithms, and in particular the Least-Squares Temporal Difference (L...
We call data weakly labeled if it has no exact label but rather a numerical indication of correctness of the label "guessed" by the learning algorithm - a situation comm...
We have recently introduced an incremental learning algorithm, Learn++ .NSE, for Non-Stationary Environments, where the data distribution changes over time due to concept drift. Le...
We consider the policy search approach to reinforcement learning. We show that if a “baseline distribution” is given (indicating roughly how often we expect a good policy to v...
J. Andrew Bagnell, Sham Kakade, Andrew Y. Ng, Jeff...