Search Sciweavers | Sciweavers

63 search results - page 11 / 13

» Mean field for Markov Decision Processes: from Discrete to C...


CORR
2006
Springer


A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD

13 years 6 months ago

Download hal.inria.fr

This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(λ), LSTD(λ)...

Manuel Loth, Philippe Preux


IJRR
2008


Motion Planning Under Uncertainty for Image-guided Medical Needle Steering

13 years 6 months ago

Download dora.cwru.edu

We develop a new motion planning algorithm for a variant of a Dubins car with binary left/right steering and apply it to steerable needles, a new class of flexible bevel-tip medica...

Ron Alterovitz, Michael S. Branicky, Kenneth Y. Go...


HT
2009
ACM


Improving recommender systems with adaptive conversational strategies

14 years 27 days ago

Download www.inf.unibz.it

Conversational recommender systems (CRSs) assist online users in their information-seeking and decision-making tasks by supporting an interactive process. Although these processes...

Tariq Mahmood, Francesco Ricci


NIPS
1996


Multidimensional Triangulation and Interpolation for Reinforcement Learning

13 years 7 months ago

Download www.cs.cmu.edu

Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...

Scott Davies


ECML
2007
Springer


Policy Gradient Critics

14 years 16 days ago

Download www.idsia.ch

We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...

Daan Wierstra, Jürgen Schmidhuber
