Sciweavers

ICML
2008
IEEE
15 years 10 months ago
Sample-based learning and search with permanent and transient memories
We present a reinforcement learning architecture, Dyna-2, that encompasses both sample-based learning and sample-based search, and that generalises across states during both learni...
David Silver, Martin Müller 0003, Richard S. ...
ICML
2004
IEEE
15 years 10 months ago
Approximate inference by Markov chains on union spaces
A standard method for approximating averages in probabilistic models is to construct a Markov chain in the product space of the random variables with the desired equilibrium distr...
Max Welling, Michal Rosen-Zvi, Yee Whye Teh
NIPS
1998
14 years 10 months ago
Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms
In this paper, we address two issues of long-standing interest in the reinforcement learning literature. First, what kinds of performance guarantees can be made for Q-learning aft...
Michael J. Kearns, Satinder P. Singh
SODA
2008
ACM
14 years 11 months ago
Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm
The problem of maximizing a concave function f(x) in a simplex S can be solved approximately by a simple greedy algorithm. For given k, the algorithm can find a point x(k) on a k-...
Kenneth L. Clarkson
TNN
2010
14 years 4 months ago
Simplifying mixture models through function approximation
The finite mixture model is a powerful tool in many statistical learning problems. In this paper, we propose a general, structure-preserving approach to reduce its model complexity, w...
Kai Zhang, James T. Kwok