Sciweavers

56 search results - page 7 / 12
» Q-Learning in Continuous State and Action Spaces
WDAG 2007 (Springer)
On Self-stabilizing Synchronous Actions Despite Byzantine Attacks
Consider a distributed network of n nodes that is connected to a global source of “beats”. All nodes receive the “beats” simultaneously, and operate in lock-step. A scheme ...
Danny Dolev, Ezra N. Hoch
ICML 2007 (IEEE)
Constructing basis functions from directed graphs for value function approximation
Basis functions derived from an undirected graph connecting nearby samples from a Markov decision process (MDP) have proven useful for approximating value functions. The success o...
Jeffrey Johns, Sridhar Mahadevan
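
The snippet above refers to the standard graph-based construction that this paper extends: sampled states are connected into a neighbourhood graph, and low-order eigenvectors of the graph Laplacian serve as basis functions for value function approximation. A minimal sketch of the undirected-graph baseline mentioned in the abstract, with illustrative choices for the state samples, neighbour count, and number of basis functions (not taken from the paper):

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh

def laplacian_basis(states, k_neighbors=5, num_basis=10):
    """Eigenvectors of the normalized Laplacian of a k-NN graph over sampled states."""
    n = len(states)
    dists = cdist(states, states)                       # pairwise Euclidean distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(dists[i])[1:k_neighbors + 1]  # nearest neighbours, skipping self
        W[i, nbrs] = 1.0
    W = np.maximum(W, W.T)                              # symmetrize -> undirected graph
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L = np.eye(n) - D_inv_sqrt @ W @ D_inv_sqrt         # normalized graph Laplacian
    _, eigvecs = eigh(L)                                # eigenvalues in ascending order
    return eigvecs[:, :num_basis]                       # smoothest eigenvectors as features

# Usage: approximate the value function linearly, V(s_i) ~= Phi[i] @ w, on the samples.
states = np.random.rand(200, 2)                         # illustrative 2-D state samples
Phi = laplacian_basis(states)                           # 200 x 10 feature matrix
```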

Publication
Algorithms and Bounds for Rollout Sampling Approximate Policy Iteration
Abstract: Several approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervis...
Christos Dimitrakakis, Michail G. Lagoudakis
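
The abstract points to classifier-based approximate policy iteration: Monte Carlo rollouts of the current policy are used to estimate which action is best at a set of sampled states, and a classifier trained on those (state, best action) pairs represents the improved policy. A minimal sketch of one such iteration, assuming a hypothetical generative model `env.sample(s, a)` that returns `(next_state, reward)` and a small discrete action set; the estimator and all names are illustrative, not the authors' algorithm:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def rollout_return(env, state, action, policy, horizon=30, gamma=0.95):
    """Monte Carlo estimate of Q(state, action): take `action`, then follow `policy`."""
    total, discount = 0.0, 1.0
    s, a = state, action
    for _ in range(horizon):
        s, r = env.sample(s, a)          # hypothetical generative model
        total += discount * r
        discount *= gamma
        a = policy(s)
    return total

def policy_iteration_step(env, policy, sampled_states, actions, n_rollouts=10):
    """Train a classifier to imitate the greedy improvement of `policy`."""
    X, y = [], []
    for s in sampled_states:
        q = [np.mean([rollout_return(env, s, a, policy) for _ in range(n_rollouts)])
             for a in actions]
        X.append(s)
        y.append(int(np.argmax(q)))      # index of the empirically best action
    clf = DecisionTreeClassifier().fit(np.array(X), np.array(y))
    return lambda s: actions[clf.predict(np.array(s).reshape(1, -1))[0]]
```
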
NIPS 2008
Particle Filter-based Policy Gradient in POMDPs
Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the bel...
Pierre-Arnaud Coquelin, Romain Deguest, Rém...
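
The abstract bases decisions on a particle filter that tracks the belief over the continuous state. A minimal bootstrap particle-filter update, assuming user-supplied functions `transition(s, a)` and `obs_likelihood(o, s)` (both hypothetical names); the policy-gradient part of the paper is not shown:

```python
import numpy as np

def particle_filter_step(particles, action, observation, transition, obs_likelihood):
    """One bootstrap particle-filter update of the belief over a continuous state.

    particles:            (N, state_dim) array approximating the current belief
    transition(s, a):     samples the next state (assumed dynamics model)
    obs_likelihood(o, s): likelihood of observation o given state s
    """
    # Propagate every particle through the stochastic dynamics.
    predicted = np.array([transition(s, action) for s in particles])
    # Weight particles by how well they explain the new observation.
    weights = np.array([obs_likelihood(observation, s) for s in predicted])
    weights = weights / weights.sum()    # assumes at least one particle fits the observation
    # Resample to return to an equally weighted particle set.
    idx = np.random.choice(len(predicted), size=len(predicted), p=weights)
    return predicted[idx]
```
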
ICML 2004 (IEEE)
Learning to fly by combining reinforcement learning with behavioural cloning
Reinforcement learning deals with learning optimal or near-optimal policies while interacting with the environment. Application domains with many continuous variables are difficul...
Eduardo F. Morales, Claude Sammut
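
The abstract combines reinforcement learning with behavioural cloning for domains with many continuous variables. A minimal sketch of one common way to combine the two: clone a policy from demonstration traces by supervised learning, then let a tabular learner explore with the cloned policy instead of uniformly random actions. The environment API (`env.reset`, `env.step`), the discretization, and the choice of learners are illustrative assumptions, not the authors' specific method:

```python
import numpy as np
from collections import defaultdict
from sklearn.neighbors import KNeighborsClassifier

# 1) Behavioural cloning: supervised learning on demonstration (state, action) pairs.
def clone_policy(demo_states, demo_actions):
    return KNeighborsClassifier(n_neighbors=3).fit(demo_states, demo_actions)

# 2) Q-learning over a coarse discretization, exploring via the cloned policy.
def q_learning_with_cloning(env, cloned, discretize, actions, episodes=500,
                            alpha=0.1, gamma=0.99, epsilon=0.2):
    Q = defaultdict(float)
    for _ in range(episodes):
        s = env.reset()                                   # hypothetical env API
        done = False
        while not done:
            key = discretize(s)
            if np.random.rand() < epsilon:
                a = cloned.predict([s])[0]                # guided exploration via the clone
            else:
                a = max(actions, key=lambda a_: Q[(key, a_)])
            s2, r, done = env.step(a)                     # hypothetical env API
            best_next = max(Q[(discretize(s2), a_)] for a_ in actions)
            Q[(key, a)] += alpha * (r + gamma * best_next - Q[(key, a)])
            s = s2
    return Q
```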