Sciweavers

COLT
2010
Springer
14 years 7 months ago
An Asymptotically Optimal Bandit Algorithm for Bounded Support Models
The multiarmed bandit problem is a typical example of the dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
Junya Honda, Akimichi Takemura
ICML
2010
IEEE
14 years 7 months ago
Online Prediction with Privacy
In this paper, we consider online prediction from expert advice in a situation where each expert observes its own loss at each time while the loss cannot be disclosed to others fo...
Jun Sakuma, Hiromi Arai
UAI
2003
14 years 11 months ago
On the Convergence of Bound Optimization Algorithms
Many practitioners who use EM and related algorithms complain that they are sometimes slow. When does this happen, and what can be done about it? In this paper, we study the gener...
Ruslan Salakhutdinov, Sam T. Roweis, Zoubin Ghahra...
ICDM
2005
IEEE
15 years 3 months ago
Leveraging Relational Autocorrelation with Latent Group Models
The presence of autocorrelation provides a strong motivation for using relational learning and inference techniques. Autocorrelation is a statistical dependence between the values...
Jennifer Neville, David Jensen
LREC
2008
14 years 11 months ago
Identification of Comparable Argument-Head Relations in Parallel Corpora
We present a machine learning framework that we are developing to support exploratory search for non-trivial linguistic configurations in low-density languages (langua...
Kathrin Spreyer, Jonas Kuhn, Bettina Schrader