We consider an opportunistic spectrum access (OSA) problem where the time-varying condition of each channel (e.g., as a result of random fading or certain primary users' activ...
The online learning problem requires a player to iteratively choose an action in an unknown and changing environment. In the standard setting of this problem, the player has to ch...
Reinforcement Learning (RL) provides a promising new approach to systems performance management that differs radically from standard queuing-theoretic approaches making use of ...
Gerald Tesauro, Nicholas K. Jong, Rajarshi Das, Mo...
We develop a novel mechanism for coordinated, distributed multiagent planning. We consider problems stated as a collection of single-agent planning problems coupled by common soft...
Some applications have to present their results in the form of ranked lists. This is the case of many information retrieval applications, in which documents must be sorted accordi...
Adriano Veloso, Humberto Mossri de Almeida, Marcos...