Proper admission control in cognitive radio networks is critical in providing QoS guarantees to secondary unlicensed users. In this paper, we study the admission control ...
Howard's policy iteration algorithm is one of the most widely used algorithms for finding optimal policies for controlling Markov Decision Processes (MDPs). When applied to we...
Basis functions derived from an undirected graph connecting nearby samples from a Markov decision process (MDP) have proven useful for approximating value functions. The success o...
Hierarchical reinforcement learning (RL) is a general framework which studies how to exploit the structure of actions and tasks to accelerate policy learning in large domains. Pri...
Several schemes have been proposed for compactly representing multiattribute utility functions, yet none seems to achieve the level of success achieved by Bayesian and Ma...