This paper explains e-blended learning as a new methodology. The e-blended learning scenario for distance learners includes live sessions. Over the last few years we developed e-learn...
We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...
Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference meth...
Doina Precup, Richard S. Sutton, Satinder P. Singh
This paper describes ActionStreams, a system for inducing task models from observations of user activity. The model can represent several task structures: hierarchy, variable sequ...
Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning...