Abstract— Q-learning is a technique used to compute an optimal policy for a controlled Markov chain based on observations of the system controlled using a non-optimal policy. It ...
Grouping based on common motion, or “common fate” provides a powerful cue for segmenting image sequences. Recently a number of algorithms have been developed that successfully...
In this paper, we study the simultaneousdriver and wire sizing (SDWS) problem under two objective functions: (i) delay minimization only, or (ii) combined delay and power dissipat...
Functional magnetic resonance imaging (fMRI) data were collected while students worked with a tutoring system that taught an algebra isomorph. A cognitive model predicted the distr...
Jon M. Fincham, John R. Anderson, Shawn Betts, Jen...
We propose a new approach to reinforcement learning which combines least squares function approximation with policy iteration. Our method is model-free and completely off policy. ...