Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference meth...
Doina Precup, Richard S. Sutton, Satinder P. Singh
We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...
Agent technology provides many services to users. The tasks in which agents are involved include information filtering, information retrieval, automation of users' tasks, browsin...
Learning to act in an unknown partially observable domain is a difficult variant of the reinforcement learning paradigm. Research in the area has focused on model-free m...
The analysis of spectral data constitutes new challenges for machine learning algorithms due to the functional nature of the data. Special attention is paid to the metric used in t...
Petra Schneider, Frank-Michael Schleif, Thomas Vil...