We describe an algorithm for learning in the presence of multiple criteria. Our technique generalizes previous approaches in that it can learn optimal policies for all linear pref...
We introduce relational temporal difference learning as an effective approach to solving multi-agent Markov decision problems with large state spaces. Our algorithm uses temporal ...
Gaussian Process Temporal Difference (GPTD) learning offers a Bayesian solution to the policy evaluation problem of reinforcement learning. In this paper we extend the GPTD framew...
A general game player is an agent capable of taking as input a description of a game’s rules in a formal language and proceeding to play without any subsequent human input. To do...
We show that the class of monotone 2O( √ log n)-term DNF formulae can be PAC learned in polynomial time under the uniform distribution from random examples only. This is an expo...