We describe a system that successfully transfers value function knowledge across multiple subdomains of realtime strategy games in the context of multiagent reinforcement learning....
Partially observable Markov decision processes (POMDPs) have been
successfully applied to various robot motion planning tasks under uncertainty.
However, most existing POMDP algo...
Haoyu Bai, David Hsu, Wee Sun Lee, and Vien A. Ngo
The present paper introduces a new model for teaching randomized learners. Our new model, though based on the classical teaching dimension model, allows to study the influence of...
Abstract—The problem of cognitive access of channels of primary users by a secondary user is considered. The transmissions of primary users are modeled as independent continuous-...
In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...
Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...