Policy Improvement for POMDPs Using Normalized Importance Sampling

We present a new method for estimating the expected return of a POMDP from experience. The estimator does not assume any knowledge of the POMDP, can estimate the returns for finite state controllers, allows experience to be gathered from arbitrary sequences of policies, and estimates the return for any new policy. We motivate the estimator from function-approximation and importance-sampling points of view and derive its bias and variance. Although the estimator is biased, it has low variance, and the bias is often irrelevant when the estimator is used for pairwise comparisons. We conclude by extending the estimator to policies with memory and compare its performance in a greedy search algorithm to the REINFORCE algorithm, showing an order-of-magnitude reduction in the number of trials required.
Christian R. Shelton
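
For intuition, here is a minimal sketch of a plain normalized (weighted) importance-sampling estimator of a policy's expected return from trajectories gathered under other policies. This is not the paper's exact estimator; the trajectory format, the `policy(observation, action)` call signature, and the use of undiscounted returns are assumptions made for illustration.

```python
import numpy as np

def normalized_is_return(trajectories, target_policy, behavior_policies):
    """Estimate the expected return of `target_policy` from off-policy data.

    Each trajectory is a list of (observation, action, reward) tuples and is
    paired with the behavior policy that generated it. A policy is a callable
    returning the probability of `action` given `observation`.
    """
    weights, returns = [], []
    for traj, behavior in zip(trajectories, behavior_policies):
        w = 1.0  # likelihood ratio of the whole trajectory
        g = 0.0  # total (undiscounted) return of the trajectory
        for obs, act, rew in traj:
            # Importance weight: target vs. behavior action probability.
            w *= target_policy(obs, act) / behavior(obs, act)
            g += rew
        weights.append(w)
        returns.append(g)
    weights = np.asarray(weights)
    returns = np.asarray(returns)
    # Normalizing by the sum of weights (rather than the number of
    # trajectories) biases the estimate but typically lowers its variance.
    return np.dot(weights, returns) / weights.sum()
```

Because every stored trajectory is simply reweighted toward the candidate policy, the same experience can be reused to score many candidate policies, which is what allows the data-efficient greedy policy search described in the abstract.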
Type Conference
Year 2001
Where UAI
Authors Christian R. Shelton