We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...
In this paper we propose an approach to address the old problem of identifying the feature conditions under which a gaming strategy can be effective. For doing this, we will build ...
Chad Hogg, Stephen Lee-Urban, Bryan Auslander, H&e...
Typical conversational recommender systems support interactive strategies that are hard-coded in advance and followed rigidly during a recommendation session. In fact, Reinforceme...
In this paper, we propose an architecture for a cognitive robot based on tactile and visual information. Visual information contains various features such as location and area of ...
Abstract. This paper is concerned with designing architectures for rational agents. In the proposed architecture, agents have belief bases that are theories in a multi-modal, highe...