We demonstrate that log-linear grammars with latent variables can be practically trained using discriminative methods. Central to efficient discriminative training is a hierarchi...
We address classification problems for which the training instances are governed by a distribution that is allowed to differ arbitrarily from the test distribution--problems also ...