Sciweavers

567 search results - page 60 / 114
» Regularized Policy Iteration
MP
2011
14 years 3 months ago
An interior-point piecewise linear penalty method for nonlinear programming
We present an interior-point penalty method for nonlinear programming (NLP), where the merit function consists of a piecewise linear penalty function (PLPF) and an ℓ2-penalty functi...
Lifeng Chen, Donald Goldfarb
HPCN
1997
Springer
15 years 4 months ago
Parallel Solution of Irregular, Sparse Matrix Problems Using High Performance Fortran
For regular, sparse, linear systems, like those derived from regular grids, using High Performance Fortran (HPF) for iterative solvers is straightforward. However, for irregular ma...
Eric de Sturler, Damian Loher
ICML
1999
IEEE
16 years 1 month ago
Least-Squares Temporal Difference Learning
Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...
Justin A. Boyan
NIPS
1998
15 years 1 months ago
Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms
In this paper, we address two issues of long-standing interest in the reinforcement learning literature. First, what kinds of performance guarantees can be made for Q-learning aft...
Michael J. Kearns, Satinder P. Singh
SISAP
2008
IEEE
15 years 7 months ago
On Reinsertions in M-tree
In this paper we introduce a new M-tree building method, utilizing the classic idea of forced reinsertions. In case a leaf is about to split, some distant objects are removed from...
Jakub Lokoč, Tomáš Skopal