With a few exceptions, discriminative training in statistical machine translation (SMT) has been content with tuning weights for large feature sets on small development data. Evid...
Empirical evidence suggests that hashing is an effective strategy for dimensionality reduction and practical nonparametric estimation. In this paper we provide exponential tail bo...
Kilian Q. Weinberger, Anirban Dasgupta, John Langf...
We show how the regularizer of Transductive Support Vector Machines (TSVM) can be trained by stochastic gradient descent for linear models and multi-layer architectures. The resul...
Michael Karlen, Jason Weston, Ayse Erkan, Ronan Co...
Logistic Regression (LR) has been widely used in statistics for many years, and has received extensive study in machine learning community recently due to its close relations to S...
Jian Zhang, Rong Jin, Yiming Yang, Alexander G. Ha...
We show how variational Bayesian inference can be implemented for very large generalized linear models. Our relaxation is proven to be a convex problem for any log-concave model. ...