Search Sciweavers | Sciweavers

85 search results - page 3 / 17

» Approximate Policy Iteration with a Policy Language Bias

109

click to vote

ICRA
2009
IEEE

143views Robotics» more ICRA 2009»

Least absolute policy iteration for robust value function approximation

15 years 8 months ago

Download sugiyama-www.cs.titech.ac.jp

Abstract— Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efﬁciency. However, it tends to be sensitive to outliers...

Masashi Sugiyama, Hirotaka Hachiya, Hisashi Kashim...

claim paper

Read More »

125

click to vote

Publication

222views

Algorithms and Bounds for Rollout Sampling Approximate Policy Iteration

15 years 10 months ago

Download arxiv.org

Abstract: Several approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervis...

Christos Dimitrakakis, Michail G. Lagoudakis

posted by olethros

Read More »

115

click to vote

ECML
2006
Springer

141views Machine Learning» more ECML 2006»

Approximate Policy Iteration for Closed-Loop Learning of Visual Tasks

15 years 5 months ago

Download www.montefiore.ulg.ac.be

Abstract. Approximate Policy Iteration (API) is a reinforcement learning paradigm that is able to solve high-dimensional, continuous control problems. We propose to exploit API for...

Sébastien Jodogne, Cyril Briquet, Justus H....

claim paper

Read More »

126

click to vote

NIPS
2008

165views Information Technology» more NIPS 2008»

Regularized Policy Iteration

15 years 3 months ago

Download webdocs.cs.ualberta.ca

In this paper we consider approximate policy-iteration-based reinforcement learning algorithms. In order to implement a flexible function approximation scheme we propose the use o...

Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...

claim paper

Read More »

109

Voted

ATAL
2008
Springer

123views Intelligent Agents» more ATAL 2008»

Sigma point policy iteration

15 years 3 months ago

Download web.mit.edu

In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...

Michael H. Bowling, Alborz Geramifard, David Winga...

claim paper

Read More »

« Prev « First page 3 / 17 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers