The pose problem is a major challenge for applying face recognition technology under real-world conditions. In this paper, an appearance-based approach is proposed to recognize face acr...
Hongzhou Zhang, Yongping Li, Lin Wang, Chengbo Wan...
Residual gradient (RG) was proposed as an alternative to TD(0) for policy evaluation when function approximation is used, but there exists little formal analysis comparing them ex...
Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease of implementation and use of "bootstrapped" return estimates to make effi...
We give the first rigorous upper bounds on the error of temporal difference (TD) algorithms for policy evaluation as a function of the amount of experience. These upper bounds pr...