We consider approximate policy evaluation for finite-state, finite-action Markov decision processes (MDPs) in the off-policy learning context and with the simulation-based least square...
In many applications, unlabelled examples are inexpensive and easy to obtain. Semisupervised approaches try to utilise such examples to reduce the predictive error. In this paper,...
We propose a new approach to reinforcement learning which combines least-squares function approximation with policy iteration. Our method is model-free and completely off-policy. ...
This paper considers the regularized learning algorithm associated with the least-square loss and reproducing kernel Hilbert spaces. The target is the error analysis for the regres...
An adaptive and iterative LSSVR algorithm based on quadratic Rényi entropy is presented in this paper. LS-SVM loses the sparseness of the support vectors, which is one of the important ...
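
The loss of sparseness mentioned in the last abstract can be illustrated with a minimal sketch of standard least-squares support vector regression. This is not the adaptive, entropy-based algorithm of that paper, whose details are not given here. The RBF kernel, the function names `rbf_kernel` and `fit_lssvr`, the hyperparameters `gamma` and `sigma`, and the synthetic sine data are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(X, Z, sigma=1.0):
    # Pairwise squared Euclidean distances between rows of X and rows of Z.
    d2 = np.sum(X**2, axis=1)[:, None] + np.sum(Z**2, axis=1)[None, :] - 2.0 * X @ Z.T
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative values from rounding
    return np.exp(-d2 / (2.0 * sigma**2))

def fit_lssvr(X, y, gamma=10.0, sigma=1.0):
    # Standard LS-SVR dual: solve the linear system
    #   [ 0      1^T        ] [ b     ]   [ 0 ]
    #   [ 1   K + I / gamma ] [ alpha ] = [ y ]
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, dual coefficients alpha

# Hypothetical toy data: noisy samples of a sine curve.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(40)

b, alpha = fit_lssvr(X, y)

# Each alpha_i is proportional to the training residual at point i, so on
# noisy data essentially every training point acts as a support vector.
n_nonzero = int(np.sum(np.abs(alpha) > 1e-8))
print(n_nonzero, "of", len(alpha), "dual coefficients are nonzero")
```

Because the squared loss replaces the epsilon-insensitive loss, the fit reduces to a single linear solve, but the resulting coefficient vector is dense. Pruning or selection schemes are one way to try to recover sparseness.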