We introduce relational temporal difference learning as an effective approach to solving multi-agent Markov decision problems with large state spaces. Our algorithm uses temporal ...
Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...
We target the problem of closed-loop learning of control policies that map visual percepts to continuous actions. Our algorithm, called Reinforcement Learning of Joint Classes (RLJ...
Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning...
One of the key problems in reinforcement learning is balancing exploration and exploitation. Another is learning and acting in large or even continuous Markov decision processes (...
Lihong Li, Michael L. Littman, Christopher R. Mans...