— When an agent observes its environment, there are two important characteristics of the perceived information. One is the relevance of information and the other is redundancy. T...
Zhihui Luo, David A. Bell, Barry McCollum, Qingxia...
Some of the most successful recent applications of reinforcement learning have used neural networks and the TD algorithm to learn evaluation functions. In this paper, we examine t...
Repair or error-recovery strategies are an important design issue in Spoken Dialogue Systems (SDSs) - how to conduct the dialogue when there is no progress (e.g. due to repeated A...
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Abstract In this paper we address the problem of simultaneous learning and coordination in multiagent Markov decision problems (MMDPs) with infinite state-spaces. We separate this ...