We consider the problem of multi-task reinforcement learning, where the agent needs to solve a sequence of Markov Decision Processes (MDPs) chosen randomly from a fixed but unknow...
Aaron Wilson, Alan Fern, Soumya Ray, Prasad Tadepa...
We1 present a new actor-critic learning model in which a Bayesian class of non-parametric critics, using Gaussian process temporal difference learning is used. Such critics model ...
We present a general framework for defining priors on model structure and sampling from the posterior using the Metropolis-Hastings algorithm. The key ideas are that structure pri...
Selection of an optimal estimator typically relies on either supervised training samples (pairs of measurements and their associated true values), or a prior probability model for...
Information filtering has made considerable progress in recent years.The predominant approaches are content-based methods and collaborative methods. Researchers have largely conc...