Search Sciweavers | Sciweavers

20

UAI
2000

133views Artificial Intelligence» more UAI 2000»

PEGASUS: A policy search method for large MDPs and POMDPs

13 years 7 months ago

We propose a new approach to the problem of searching a space of policies for a Markov decision process (MDP) or a partially observable Markov decision process (POMDP), given a mo...

Andrew Y. Ng, Michael I. Jordan

claim paper

Read More »

20

click to vote

AAAI
2010

172views Intelligent Agents» more AAAI 2010»

Using Bisimulation for Policy Transfer in MDPs

13 years 7 months ago

Download www.cs.mcgill.ca

Knowledge transfer has been suggested as a useful approach for solving large Markov Decision Processes. The main idea is to compute a decision-making policy in one environment and...

Pablo Samuel Castro, Doina Precup

claim paper

Read More »

17

click to vote

JMLR
2010

189views more JMLR 2010»

Adaptive Step-size Policy Gradients with Average Reward Metric

13 years 12 days ago

Download jmlr.csail.mit.edu

In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...

Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...

claim paper

Read More »

12

click to vote

AAAI
2006

86views Intelligent Agents» more AAAI 2006»

Targeting Specific Distributions of Trajectories in MDPs

13 years 7 months ago

Download www.cc.gatech.edu

We define TTD-MDPs, a novel class of Markov decision processes where the traditional goal of an agent is changed from finding an optimal trajectory through a state space to realiz...

David L. Roberts, Mark J. Nelson, Charles Lee Isbe...

claim paper

Read More »

20

click to vote

ECAI
2004
Springer

227views Artificial Intelligence» more ECAI 2004»

On-Line Search for Solving Markov Decision Processes via Heuristic Sampling

13 years 11 months ago

Download www.inra.fr

In the past, Markov Decision Processes (MDPs) have become a standard for solving problems of sequential decision under uncertainty. The usual request in this framework is the compu...

Laurent Péret, Frédérick Garc...

claim paper

Read More »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers