We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Recently, several manifold learning algorithms have been proposed, such as ISOMAP (Tenenbaum et al., 2000), Locally Linear Embedding (Roweis & Saul, 2000), Laplacian Eigenmap ...
Variational inference methods, including mean field methods and loopy belief propagation, have been widely used for approximate probabilistic inference in graphical models. While ...
Conditional Random Fields (CRFs; Lafferty, McCallum, & Pereira, 2001) provide a flexible and powerful model for learning to assign labels to elements of sequences in such appl...
Thomas G. Dietterich, Adam Ashenfelter, Yaroslav B...
The Relevance Vector Machine (RVM) is a sparse approximate Bayesian kernel method. It provides full predictive distributions for test cases. However, the predictive uncertainties ...