Sciweavers

86 search results - page 14 / 18
» Estimation and Approximation Bounds for Gradient-Based Reinf...
ICML
2008
IEEE
16 years 2 months ago
Sample-based learning and search with permanent and transient memories
We present a reinforcement learning architecture, Dyna-2, that encompasses both sample-based learning and sample-based search, and that generalises across states during both learni...
David Silver, Martin Müller, Richard S. ...
ICML
2004
IEEE
16 years 2 months ago
Approximate inference by Markov chains on union spaces
A standard method for approximating averages in probabilistic models is to construct a Markov chain in the product space of the random variables with the desired equilibrium distr...
Max Welling, Michal Rosen-Zvi, Yee Whye Teh
NIPS
1998
15 years 3 months ago
Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms
In this paper, we address two issues of long-standing interest in the reinforcement learning literature. First, what kinds of performance guarantees can be made for Q-learning aft...
Michael J. Kearns, Satinder P. Singh
SODA
2008
ACM
15 years 3 months ago
Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm
The problem of maximizing a concave function f(x) in a simplex S can be solved approximately by a simple greedy algorithm. For given k, the algorithm can find a point x(k) on a k-...
Kenneth L. Clarkson
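As a rough illustration of the greedy scheme the abstract above describes, the following minimal sketch runs a Frank-Wolfe style iteration over the probability simplex. The abstract is truncated, so the details here are assumptions rather than the paper's exact procedure: the step size 2/(k+2) is the textbook Frank-Wolfe choice, and the function name `frank_wolfe_simplex`, the quadratic objective, and the `target` point are hypothetical and chosen only for the example.

```python
def frank_wolfe_simplex(grad, dim, steps):
    """Greedy (Frank-Wolfe style) maximization of a concave function
    over the probability simplex, given a gradient oracle `grad`."""
    x = [0.0] * dim
    x[0] = 1.0  # start at an arbitrary vertex of the simplex
    for k in range(steps):
        g = grad(x)
        # The linearized objective is maximized at a vertex: pick the
        # coordinate with the largest partial derivative.
        i = max(range(dim), key=lambda j: g[j])
        gamma = 2.0 / (k + 2.0)  # assumed textbook step size
        # Move toward vertex e_i; the iterate stays in the simplex.
        x = [(1.0 - gamma) * xj for xj in x]
        x[i] += gamma
    return x


# Hypothetical concave objective: f(x) = -||x - target||^2.
target = [0.2, 0.3, 0.5]


def f(x):
    return -sum((a - b) ** 2 for a, b in zip(x, target))


def grad_f(x):
    return [-2.0 * (a - b) for a, b in zip(x, target)]


x_approx = frank_wolfe_simplex(grad_f, 3, 200)
```

Each iteration adds weight to a single vertex, so after a small number of steps the iterate has few nonzero coordinates, which is the sparsity property that greedy approximations of this kind rely on.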
TNN
2010
14 years 8 months ago
Simplifying mixture models through function approximation
The finite mixture model is a powerful tool in many statistical learning problems. In this paper, we propose a general, structure-preserving approach to reduce its model complexity, w...
Kai Zhang, James T. Kwok