Sciweavers

223 search results - page 32 / 45
» Least-Squares Temporal Difference Learning
JMLR
2002
14 years 9 months ago
On the Convergence of Optimistic Policy Iteration
We consider a finite-state Markov decision problem and establish the convergence of a special case of optimistic policy iteration that involves Monte Carlo estimation of Q-values,...
John N. Tsitsiklis
FLAIRS
2004
14 years 11 months ago
On the Pedagogically Guided Paper Recommendation for an Evolving Web-Based Learning System
In this paper we discuss the mechanism of a recommender system recommending papers for an evolving web-based learning system. Our system is unique in three aspects. The first is t...
Tiffany Ya Tang, Gordon I. McCalla
ICPR
2006
IEEE
15 years 10 months ago
Robust Recursive Learning for Foreground Region Detection in Videos with Quasi-Stationary Backgrounds
Detecting regions of interest in video sequences is the most important task in many high-level video processing applications. In this paper a robust technique based on recursive l...
Alireza Tavakkoli, George Bebis, Mircea Nicolescu
AAAI
2006
14 years 11 months ago
Sample-Efficient Evolutionary Function Approximation for Reinforcement Learning
Reinforcement learning problems are commonly tackled with temporal difference methods, which attempt to estimate the agent's optimal value function. In most real-world proble...
Shimon Whiteson, Peter Stone
CVPR
2009
IEEE
16 years 5 months ago
Learning sign language by watching TV (using weakly aligned subtitles)
The goal of this work is to automatically learn a large number of British Sign Language (BSL) signs from TV broadcasts. We achieve this by using the supervisory information avai...
Patrick Buehler (University of Oxford), Mark Everi...