We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Abstract. In classical approaches to knowledge representation, reasoners are assumed to derive all the logical consequences of their knowledge base. As a result, reasoning in the ļ...
The goal of approximate policy evaluation is to ābestā represent a target value function according to a speciļ¬c criterion. Temporal difference methods and Bellman residual m...
In this contribution, a novel spatio-temporal prediction algorithm for video coding is introduced. This algorithm exploits temporal as well as spatial redundancies for effectively...
Reasoning about the past is of fundamental importance in several applications in computer science and artiļ¬cial intelligence, including reactive systems and planning. In this pa...