Search Sciweavers | Sciweavers

55 search results - page 9 / 11

» Approximate Policy Iteration using Large-Margin Classifiers

click to vote

UAI
2001

194views Artificial Intelligence» more UAI 2001»

Expectation Propagation for approximate Bayesian inference

13 years 7 months ago

Download research.microsoft.com

This paper presents a new deterministic approximation technique in Bayesian networks. This method, "Expectation Propagation," unifies two previous techniques: assumed-de...

Thomas P. Minka

claim paper

Read More »

click to vote

CORR
2006
Springer

113views Education» more CORR 2006»

A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD

13 years 6 months ago

Download hal.inria.fr

This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(), LSTD()...

Manuel Loth, Philippe Preux

claim paper

Read More »

click to vote

NIPS
2007

164views Information Technology» more NIPS 2007»

Incremental Natural Actor-Critic Algorithms

13 years 7 months ago

Download books.nips.cc

We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...

Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...

claim paper

Read More »

click to vote

ICML
1999
IEEE

168views Machine Learning» more ICML 1999»

Least-Squares Temporal Difference Learning

14 years 7 months ago

Download www.research.rutgers.edu

Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...

Justin A. Boyan

claim paper

Read More »

click to vote

IJCAI
2007

182views Artificial Intelligence» more IJCAI 2007»

A Fast Analytical Algorithm for Solving Markov Decision Processes with Real-Valued Resources

13 years 7 months ago

Download teamcore.usc.edu

Agents often have to construct plans that obey deadlines or, more generally, resource limits for real-valued resources whose consumption can only be characterized by probability d...

Janusz Marecki, Sven Koenig, Milind Tambe

claim paper

Read More »

« Prev « First page 9 / 11 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers