Sciweavers

56 search results - page 6 / 12
» Q-Learning in Continuous State and Action Spaces
ASE 2004
Cluster-Based Partial-Order Reduction
The verification of concurrent systems through an exhaustive traversal of the state space suffers from the infamous state-space-explosion problem, caused by the many interleavings ...
Twan Basten, Dragan Bosnacki, Marc Geilen
ICML 2003, IEEE
Hierarchical Policy Gradient Algorithms
Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning...
Mohammad Ghavamzadeh, Sridhar Mahadevan
ICML 2006, IEEE
PAC model-free reinforcement learning
For a Markov Decision Process with finite state (size S) and action spaces (size A per state), we propose a new algorithm--Delayed Q-Learning. We prove it is PAC, achieving near o...
Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...
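Note: the sketch below shows only the textbook tabular Q-learning update that Delayed Q-Learning builds on, not the paper's PAC algorithm itself; the environment interface (reset/step), parameter values, and names are illustrative assumptions.

    import numpy as np

    def q_learning(env, num_states, num_actions, episodes=500,
                   alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
        """Baseline tabular Q-learning with epsilon-greedy exploration.

        `env` is assumed to expose reset() -> state and
        step(action) -> (next_state, reward, done), with states and
        actions represented as integer indices.
        """
        rng = np.random.default_rng(seed)
        Q = np.zeros((num_states, num_actions))
        for _ in range(episodes):
            s = env.reset()
            done = False
            while not done:
                # epsilon-greedy action selection
                if rng.random() < epsilon:
                    a = int(rng.integers(num_actions))
                else:
                    a = int(np.argmax(Q[s]))
                s_next, r, done = env.step(a)
                # standard update:
                # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
                target = r + (0.0 if done else gamma * np.max(Q[s_next]))
                Q[s, a] += alpha * (target - Q[s, a])
                s = s_next
        return Q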
IROS 2008, IEEE
Learning nonparametric policies by imitation
A long cherished goal in artificial intelligence has been the ability to endow a robot with the capacity to learn and generalize skills from watching a human teacher. Such an ...
David B. Grimes, Rajesh P. N. Rao
UAI 2000
PEGASUS: A policy search method for large MDPs and POMDPs
We propose a new approach to the problem of searching a space of policies for a Markov decision process (MDP) or a partially observable Markov decision process (POMDP), given a mo...
Andrew Y. Ng, Michael I. Jordan
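Note: the abstract above is truncated; the core idea of PEGASUS is to evaluate every candidate policy on the same fixed set of pre-drawn random numbers ("scenarios"), which makes policy evaluation deterministic so that ordinary search can optimize it. The sketch below illustrates only that evaluation step; the simulator interface, policy representation, and parameter values are assumptions, not the paper's code.

    import numpy as np

    def make_scenarios(n_scenarios, horizon, seed=0):
        """Pre-draw the random numbers the simulator will consume.

        Each row fixes the noise for one whole rollout, so evaluating
        any policy on these scenarios is deterministic.
        """
        rng = np.random.default_rng(seed)
        return rng.random((n_scenarios, horizon))

    def evaluate_policy(policy, simulate_step, init_state, scenarios, gamma=0.95):
        """Deterministic value estimate: average discounted return over
        the fixed scenarios.  `simulate_step(state, action, u)` is an
        assumed deterministic simulator that consumes one pre-drawn
        noise value u and returns (next_state, reward)."""
        total = 0.0
        for noise in scenarios:
            state, ret, discount = init_state, 0.0, 1.0
            for u in noise:
                action = policy(state)
                state, reward = simulate_step(state, action, u)
                ret += discount * reward
                discount *= gamma
            total += ret
        return total / len(scenarios)

Because the scenarios are fixed, evaluate_policy is a deterministic function of the policy, so hill climbing or any gradient-free optimizer can be applied to it directly.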