: The generic problem of estimation and inference given a sequence of i.i.d. samples has been extensively studied in the statistics, property testing, and learning communities. A n...
Many NLP tasks rely on accurately estimating word dependency probabilities P(w1|w2), where the words w1 and w2 have a particular relationship (such as verb-object). Because of the...
Kristina Toutanova, Christopher D. Manning, Andrew...
Many applications in multiagent learning are essentially convex optimization problems in which agents have only limited communication and partial information about the function be...
Renato L. G. Cavalcante, Alex Rogers, Nicholas R. ...
Several reinforcement-learning techniques have already been applied to the Acrobot control problem, using linear function approximators to estimate the value function. In this pape...
We1 present a new actor-critic learning model in which a Bayesian class of non-parametric critics, using Gaussian process temporal difference learning is used. Such critics model ...