Wormholes Improve Contrastive Divergence

In models that define probabilities via energies, maximum likelihood learning typically involves using Markov Chain Monte Carlo to sample from the model’s distribution. If the Markov chain is started at the data distribution, learning often works well even if the chain is only run for a few time steps [3]. But if the data distribution contains modes separated by regions of very low density, brief MCMC will not ensure that different modes have the correct relative energies because it cannot move particles from one mode to another. We show how to improve brief MCMC by allowing long-range moves that are suggested by the data distribution. If the model is approximately correct, these long-range moves have a reasonable acceptance rate.
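
As a rough illustration of the idea in the abstract, the sketch below mixes ordinary local Metropolis moves with occasional long-range proposals drawn near a randomly chosen data point. It is a simplified stand-in and not the paper's wormhole construction: the long-range move here is a plain independence-style Metropolis-Hastings proposal from a Gaussian kernel density estimate over the data. The two-well `energy` function, the `DATA` values, and all parameter names and defaults are hypothetical choices made only for this example.

```python
import math
import random

# Hypothetical 1-D data set with two well-separated modes.
DATA = [-5.2, -5.0, -4.8, 4.8, 5.0, 5.2]


def energy(x):
    # Hypothetical two-well energy with minima at x = -5 and x = +5.
    return 0.5 * min((x + 5.0) ** 2, (x - 5.0) ** 2)


def log_kde(x, data, width):
    # Log density of an equal-weight Gaussian mixture centred on the data.
    norm = len(data) * width * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - d) / width) ** 2) for d in data)
    return math.log(total / norm + 1e-300)  # small constant avoids log(0)


def mcmc_step(x, data, rng, p_jump=0.2, local_step=0.3, width=0.5):
    if rng.random() < p_jump:
        # Long-range move suggested by the data: propose near a data point.
        x_new = rng.choice(data) + rng.gauss(0.0, width)
        # The proposal is not symmetric, so include the Hastings correction.
        log_accept = (energy(x) - energy(x_new)
                      + log_kde(x, data, width) - log_kde(x_new, data, width))
    else:
        # Local random-walk move (symmetric proposal).
        x_new = x + rng.gauss(0.0, local_step)
        log_accept = energy(x) - energy(x_new)
    if rng.random() < math.exp(min(0.0, log_accept)):
        return x_new
    return x


def run_chain(data, n_steps, p_jump, seed=0):
    rng = random.Random(seed)
    x = data[0]  # start the chain at a data point
    samples = []
    for _ in range(n_steps):
        x = mcmc_step(x, data, rng, p_jump=p_jump)
        samples.append(x)
    return samples
```

Calling `run_chain(DATA, 5000, p_jump=0.2)` lets the chain visit both modes, whereas `p_jump=0.0` reduces it to purely local moves that, under these assumed settings, are very unlikely to cross the high-energy region between the modes in a short run.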
Geoffrey E. Hinton, Max Welling, Andriy Mnih
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2003
Where NIPS