Search Sciweavers | Sciweavers

34 search results - page 6 / 7

» Using Temporal Neighborhoods to Adapt Function Approximators...

click to vote

NIPS
1996

192views Information Technology» more NIPS 1996»

Multidimensional Triangulation and Interpolation for Reinforcement Learning

13 years 6 months ago

Download www.cs.cmu.edu

Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...

Scott Davies

claim paper

Read More »

click to vote

NIPS
1994

90views Information Technology» more NIPS 1994»

Reinforcement Learning with Soft State Aggregation

13 years 6 months ago

Download www.eecs.umich.edu

It is widely accepted that the use of more compact representations than lookup tables is crucial to scaling reinforcement learning (RL) algorithms to real-world problems. Unfortun...

Satinder P. Singh, Tommi Jaakkola, Michael I. Jord...

claim paper

Read More »

click to vote

ATAL
2008
Springer

123views Intelligent Agents» more ATAL 2008»

Sigma point policy iteration

13 years 6 months ago

Download web.mit.edu

In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...

Michael H. Bowling, Alborz Geramifard, David Winga...

claim paper

Read More »

click to vote

ICML
2009
IEEE

194views Machine Learning» more ICML 2009»

Binary action search for learning continuous-action control policies

14 years 5 months ago

Download www.intelligence.tuc.gr

Reinforcement Learning methods for controlling stochastic processes typically assume a small and discrete action space. While continuous action spaces are quite common in real-wor...

Jason Pazis, Michail G. Lagoudakis

claim paper

Read More »

click to vote

ECML
2006
Springer

146views Machine Learning» more ECML 2006»

Task-Driven Discretization of the Joint Space of Visual Percepts and Continuous Actions

13 years 8 months ago

Download www.montefiore.ulg.ac.be

We target the problem of closed-loop learning of control policies that map visual percepts to continuous actions. Our algorithm, called Reinforcement Learning of Joint Classes (RLJ...

Sébastien Jodogne, Justus H. Piater

claim paper

Read More »

« Prev « First page 6 / 7 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers