Sciweavers

63 search results - page 11 / 13
CORR
2006
Springer
13 years 6 months ago
A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD
This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(λ), LSTD(λ)...
Manuel Loth, Philippe Preux
IJRR
2008
13 years 6 months ago
Motion Planning Under Uncertainty for Image-guided Medical Needle Steering
We develop a new motion planning algorithm for a variant of a Dubins car with binary left/right steering and apply it to steerable needles, a new class of flexible bevel-tip medica...
Ron Alterovitz, Michael S. Branicky, Kenneth Y. Go...
HT
2009
ACM
14 years 27 days ago
Improving recommender systems with adaptive conversational strategies
Conversational recommender systems (CRSs) assist online users in their information-seeking and decision-making tasks by supporting an interactive process. Although these processes...
Tariq Mahmood, Francesco Ricci
NIPS
1996
13 years 7 months ago
Multidimensional Triangulation and Interpolation for Reinforcement Learning
Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...
Scott Davies
ECML
2007
Springer
14 years 16 days ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber