In models that define probabilities via energies, maximum likelihood learning typically involves using Markov Chain Monte Carlo to sample from the model’s distribution. If the ...
We present a novel discriminative training algorithm for n-gram language models for use in large vocabulary continuous speech recognition. The algorithm uses large margin estimati...
Current state-of-the-art speech recognition systems work quite well in controlled environments but their performance degrades severely in realistic acoustical conditions in reverb...
We consider the problem of decision-making with side information and unbounded loss functions. Inspired by probably approximately correct learning model, we use a slightly differe...
Previous work on statistical language modeling has shown that it is possible to train a feed-forward neural network to approximate probabilities over sequences of words, resulting...