Search Sciweavers | Sciweavers

56 search results - page 6 / 12

» Q-Learning in Continuous State and Action Spaces

ASE
2004

Cluster-Based Partial-Order Reduction

14 years 11 months ago

Download www.ics.ele.tue.nl

The verification of concurrent systems through an exhaustive traversal of the state space suffers from the infamous state-space-explosion problem, caused by the many interleavings ...

Twan Basten, Dragan Bosnacki, Marc Geilen

ICML
2003
IEEE

Hierarchical Policy Gradient Algorithms

16 years 13 days ago

Download www.hpl.hp.com

Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning...

Mohammad Ghavamzadeh, Sridhar Mahadevan

ICML
2006
IEEE

PAC model-free reinforcement learning

16 years 13 days ago

Download cseweb.ucsd.edu

For a Markov Decision Process with finite state (size S) and action spaces (size A per state), we propose a new algorithm--Delayed Q-Learning. We prove it is PAC, achieving near o...

Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...

IROS
2008
IEEE

Learning nonparametric policies by imitation

15 years 6 months ago

Download www.cs.washington.edu

A long cherished goal in artificial intelligence has been the ability to endow a robot with the capacity to learn and generalize skills from watching a human teacher. Such an ...

David B. Grimes, Rajesh P. N. Rao

UAI
2000

PEGASUS: A policy search method for large MDPs and POMDPs

15 years 1 month ago

Download ai.stanford.edu

We propose a new approach to the problem of searching a space of policies for a Markov decision process (MDP) or a partially observable Markov decision process (POMDP), given a mo...

Andrew Y. Ng, Michael I. Jordan
