Adaptive Importance Sampling to Accelerate Training of a Neural Probabilistic Language Model

Previous work on statistical language modeling has shown that it is possible to train a feed-forward neural network to approximate probabilities over sequences of words, resulting in significant error reduction when compared to standard baseline models based on n-grams. However, training the neural network model with the maximum likelihood criterion requires computations proportional to the number of words in the vocabulary. We introduce adaptive importance sampling as a way to accelerate training of the model. The idea is to use an adaptive n-gram model to track the conditional distributions produced by the neural network. We show that a very significant speed-up can be obtained on standard problems.
Yoshua Bengio, Jean-Sébastien Senecal
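
To make the technique concrete, below is a minimal NumPy sketch of the core computation: a self-normalized importance-sampling estimate of the softmax gradient. It is an illustration under stated assumptions, not the authors' code: every name is invented, a linear scorer stands in for the neural network, and a uniform proposal stands in for the adaptive n-gram proposal the paper uses.

# Minimal sketch (not the authors' code): estimating the gradient of a
# softmax language model with self-normalized importance sampling.
import numpy as np

rng = np.random.default_rng(0)

VOCAB = 10_000   # vocabulary size; the exact softmax sums over all of it
K = 25           # importance samples per step; K << VOCAB is the speed-up
DIM = 50         # toy feature dimension (assumption of this sketch)

theta = rng.normal(scale=0.01, size=(VOCAB, DIM))  # one output row per word

def score(word_ids, h):
    # Unnormalized log-score s(w, h) of each word given context features h.
    return theta[word_ids] @ h

def proposal_log_prob(word_ids):
    # Stand-in proposal q(w | h). The paper adapts an n-gram model so that q
    # tracks the network's conditional distribution; the uniform q here is
    # the simplest placeholder and is an assumption of this sketch.
    return np.full(len(word_ids), -np.log(VOCAB))

def is_gradient(target, h):
    # Exact gradient of -log P(target | h) w.r.t. theta is
    #   -grad s(target, h) + sum_w P(w | h) grad s(w, h),
    # where the second (partition-function) term costs O(VOCAB).
    # Approximate it with K draws from q, weighted by
    #   r_i proportional to exp(s(w_i, h)) / q(w_i | h), normalized to sum to 1.
    samples = rng.integers(0, VOCAB, size=K)          # draw from uniform q
    log_r = score(samples, h) - proposal_log_prob(samples)
    r = np.exp(log_r - log_r.max())
    r /= r.sum()                                      # self-normalize
    grad = np.zeros_like(theta)
    grad[target] -= h                                 # positive-example term
    np.add.at(grad, samples, r[:, None] * h)          # sampled negative term
    return grad

h = rng.normal(size=DIM)                              # fake context features
g = is_gradient(target=42, h=h)                       # touches K+1 rows, not VOCAB

Each step updates only the K sampled rows of theta plus the target row instead of all VOCAB rows, which is where the speed-up comes from; the paper's contribution is adapting the proposal so it stays close to the network's conditional distribution during training, keeping the importance weights well behaved.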
Type: Journal
Year: 2008
Where: TNN (IEEE Transactions on Neural Networks)