We present an information-theoretic approach for learning a linear dimension-reduction transform for object classification. The theoretical principle guiding the approach is that the trans...
In this paper we unify two supposedly distinct tasks in multimedia retrieval. One task involves answering queries with a few examples. The other involves learning models for seman...
Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference meth...
Doina Precup, Richard S. Sutton, Satinder P. Singh
We develop a new error bound for transductive learning algorithms. The slack term in the new bound is a function of a relaxed notion of transductive stability, which meas...
Generalizing policies in reinforcement learning is a central issue, both for the theoretical model and for practical applicability. However, generalizing from a se...