We consider reinforcement learning as solving a Markov decision process with unknown transition distribution. Based on interaction with the environment, an estimate of the transit...
This paper proposes a new two-tier calculus designed to model aspects of CORBA-like systems at the CORBA level. The higher object level, known as Oompa, abstracts away from...
Malcolm Tyrrell, Andrew Butterfield, Alexis Donnel...
We describe a system for evaluating sleep macrostructure based on Emfit sensor foils placed in the bed mattress and on advanced signal processing. The signals on wh...
Juha M. Kortelainen, Martin O. Mendez, Anna M. Bia...
Flexible general purpose robots need to tailor their visual processing to their task, on the fly. We propose a new approach to this within a planning framework, where the goal is ...
We consider a variant of the classic multi-armed bandit problem (MAB), which we call FEEDBACK MAB, where the reward obtained by playing each of n independent arms varies according...