Sciweavers search results for "A Variance Analysis for POMDP Policy Evaluation" (7 results, page 1 of 2)

AAAI 2008
A Variance Analysis for POMDP Policy Evaluation
Partially Observable Markov Decision Processes have been studied widely as a model for decision making under uncertainty, and a number of methods have been developed to find the s...
Mahdi Milani Fard, Joelle Pineau, Peng Sun
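
A variance analysis of this kind concerns the spread of value estimates under a fixed policy. As a minimal illustrative sketch (not the paper's method), the Python below estimates a POMDP policy's value by Monte Carlo rollouts and reports the sample mean and sample variance of the discounted return; the env interface (reset, initial_belief, step, update_belief) and the belief-conditioned policy are assumed names for illustration, not a real API.

    def evaluate_policy(env, policy, episodes=1000, gamma=0.95):
        """Monte Carlo policy evaluation in a POMDP: sample discounted
        returns under `policy` and report their mean and variance."""
        returns = []
        for _ in range(episodes):
            env.reset()
            belief = env.initial_belief()   # belief over hidden states
            done, ret, discount = False, 0.0, 1.0
            while not done:
                action = policy(belief)                  # act on the belief
                obs, reward, done = env.step(action)
                belief = env.update_belief(belief, action, obs)
                ret += discount * reward
                discount *= gamma
            returns.append(ret)
        mean = sum(returns) / len(returns)
        var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
        return mean, var

The sample variance returned here is exactly the quantity such an analysis tries to characterize and bound.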

NIPS 2008
Particle Filter-based Policy Gradient in POMDPs
Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the bel...
Pierre-Arnaud Coquelin, Romain Deguest, Rém...
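
The belief-tracking component such a method builds on can be sketched as a bootstrap particle filter update. This is a generic sketch under assumed model interfaces (transition_sample and observation_likelihood are hypothetical stand-ins for the continuous-state POMDP model), not the paper's gradient estimator.

    import numpy as np

    def particle_filter_step(particles, weights, action, observation,
                             transition_sample, observation_likelihood):
        """One bootstrap particle filter update of the belief.
        transition_sample(s, a): draws a successor state sample.
        observation_likelihood(o, s): density p(o | s)."""
        # Propagate each particle through the stochastic transition model.
        particles = np.array([transition_sample(s, action) for s in particles])
        # Reweight particles by how well they explain the new observation.
        weights = weights * np.array([observation_likelihood(observation, s)
                                      for s in particles])
        weights = weights / weights.sum()
        # Multinomial resampling to counter weight degeneracy.
        idx = np.random.choice(len(particles), size=len(particles), p=weights)
        return particles[idx], np.full(len(particles), 1.0 / len(particles))

Decisions are then taken as a function of the particle set, which approximates the belief state.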

ATAL 2009, Springer
SarsaLandmark: an algorithm for learning in POMDPs with landmarks
Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in pa...
Michael R. James, Satinder P. Singh
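
For reference, the base algorithm the abstract refers to is tabular Sarsa(λ) with eligibility traces; a standard sketch over observations (treated as estimated states) follows. The env interface (actions, reset, step) is an assumed name, and the landmark-specific trace handling that SarsaLandmark adds is not shown.

    import random
    from collections import defaultdict

    def sarsa_lambda(env, episodes=500, alpha=0.1, gamma=0.99,
                     lam=0.9, epsilon=0.1):
        """Tabular Sarsa(lambda) with replacing eligibility traces."""
        Q = defaultdict(float)

        def eps_greedy(obs):
            if random.random() < epsilon:
                return random.choice(env.actions)
            return max(env.actions, key=lambda a: Q[(obs, a)])

        for _ in range(episodes):
            traces = defaultdict(float)
            obs, done = env.reset(), False
            action = eps_greedy(obs)
            while not done:
                next_obs, reward, done = env.step(action)
                next_action = eps_greedy(next_obs)
                target = reward + (0.0 if done
                                   else gamma * Q[(next_obs, next_action)])
                delta = target - Q[(obs, action)]
                traces[(obs, action)] = 1.0          # replacing trace
                for key in list(traces):
                    Q[key] += alpha * delta * traces[key]
                    traces[key] *= gamma * lam       # decay all traces
                obs, action = next_obs, next_action
        return Q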

NIPS 2008
Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms
Policy gradient (PG) reinforcement learning algorithms have strong (local) convergence guarantees, but their learning performance is typically limited by a large variance in the e...
John W. Roberts, Russ Tedrake
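
One common way to quantify this effect is the ratio of the squared mean of the gradient estimator to the variance of the samples around that mean; the sketch below uses that definition, which may differ from the paper's in its details.

    import numpy as np

    def gradient_snr(gradient_samples):
        """Empirical SNR of a policy gradient estimator.
        gradient_samples: array of shape (n_samples, n_params),
        one gradient estimate per rollout (or batch)."""
        g = np.asarray(gradient_samples, dtype=float)
        mean = g.mean(axis=0)                          # "signal"
        noise = ((g - mean) ** 2).sum(axis=1).mean()   # total variance, "noise"
        return float(mean @ mean) / noise

A low SNR means many more rollouts are needed before the estimated update direction becomes reliable.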

PROMAS 2004, Springer
Coordinating Teams in Uncertain Environments: A Hybrid BDI-POMDP Approach
Distributed partially observable Markov decision problems (POMDPs) have emerged as a popular decision-theoretic approach for planning for multiagent teams, where it is imperative f...
Ranjit Nair, Milind Tambe