In a cognitive radio network, the full-spectrum is usually divided into multiple channels. However, due to the hardware and energy constraints, a cognitive user (also called second...
We consider the task of reinforcement learning in an environment in which rare significant events occur independently of the actions selected by the controlling agent. If these ev...
We develop a new graphical representation for interactive partially observable Markov decision processes (I-POMDPs) that is significantly more transparent and semantically clear t...
Abstract. In the aftermath of a large-scale disaster, agents’ decisions derive from self-interested (e.g. survival), common-good (e.g. victims’ rescue) and teamwork (e.g. fire...
Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...