There is increasing research interest in solving routing problems in sensor networks subject to constraints such as data correlation, link reliability and energy conservation. Sin...
Gaussian Process Temporal Difference (GPTD) learning offers a Bayesian solution to the policy evaluation problem of reinforcement learning. In this paper we extend the GPTD framew...
We address two open theoretical questions in Policy Gradient Reinforcement Learning. The first concerns the efficacy of using function approximation to represent the state action ...
— This paper presents a new reinforcement learning algorithm for accelerating acquisition of new skills by real mobile robots, without requiring simulation. It speeds up Q-learni...
Reinforcement learning in real-world domains suffers from three curses of dimensionality: explosions in state and action spaces, and high stochasticity. We present approaches that ...