Sciweavers

» Convergence of the Wake-Sleep Algorithm
ICMLA
2010
15 years 2 months ago
Multimodal Parameter-exploring Policy Gradients
Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...
Frank Sehnke, Alex Graves, Christian Osendorfer, J...
CDC
2010
IEEE
14 years 11 months ago
Stochastic approximation for consensus with general time-varying weight matrices
This paper considers consensus problems with delayed noisy measurements, and stochastic approximation is used to achieve mean square consensus. For stochastic approximation based c...
Minyi Huang
CDC
2010
IEEE
14 years 11 months ago
Hybrid control for navigation of shape-accelerated underactuated balancing systems
This paper presents a hybrid control strategy for navigation of shape-accelerated underactuated balancing systems with dynamic constraints. It extends the concept of sequential com...
Umashankar Nagarajan, George Kantor, Ralph L. Holl...
SIAMNUM
2010
14 years 11 months ago
Hybridization and Postprocessing Techniques for Mixed Eigenfunctions
We introduce hybridization and postprocessing techniques for the Raviart-Thomas approximation of second-order elliptic eigenvalue problems. Hybridization reduces the Ravia...
Bernardo Cockburn, Jayadeep Gopalakrishnan, F. Li,...
AAAI
2011
14 years 4 months ago
Dual Decomposition for Marginal Inference
We present a dual decomposition approach to the tree-reweighted belief propagation objective. Each tree in the tree-reweighted bound yields one subproblem, which can be solved with...
Justin Domke