Search Sciweavers | Sciweavers

18 search results - page 1 / 4

AAAI
2006

146 views · Intelligent Agents

Incremental Least Squares Policy Iteration for POMDPs

13 years 6 months ago

Download www.aaai.org

We present a new algorithm, called incremental least squares policy iteration (ILSPI), for finding the infinite-horizon stationary policy for partially observable Markov decision ...

Hui Li, Xuejun Liao, Lawrence Carin

ECAI
2006
Springer

245 views · Artificial Intelligence

Least Squares SVM for Least Squares TD Learning

13 years 8 months ago

Download homepages.feis.herts.ac.uk

Abstract. We formulate the problem of least squares temporal difference learning (LSTD) in the framework of least squares SVM (LS-SVM). To cope with the large amount (and possible ...

Tobias Jung, Daniel Polani

PKDD
2009
Springer

169 views · Data Mining

Hybrid Least-Squares Algorithms for Approximate Policy Evaluation

13 years 11 months ago

Download www.cs.umass.edu

The goal of approximate policy evaluation is to “best” represent a target value function according to a specific criterion. Temporal difference methods and Bellman residual m...

Jeffrey Johns, Marek Petrik, Sridhar Mahadevan

ICML
1999
IEEE

168 views · Machine Learning

Least-Squares Temporal Difference Learning

14 years 5 months ago

Download www.research.rutgers.edu

Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...

Justin A. Boyan

NIPS
2001

206 views · Information Technology

Model-Free Least-Squares Policy Iteration

13 years 6 months ago

Download www.cs.duke.edu

We propose a new approach to reinforcement learning which combines least squares function approximation with policy iteration. Our method is model-free and completely off policy. ...

Michail G. Lagoudakis, Ronald Parr
