Sciweavers

CORR
2012
Springer
12 years 9 days ago
An Incremental Sampling-based Algorithm for Stochastic Optimal Control
In this paper, we consider a class of continuous-time, continuous-space stochastic optimal control problems. Building upon recent advances in Markov chain approximation ...
Vu Anh Huynh, Sertac Karaman, Emilio Frazzoli
AIPS
2011
12 years 8 months ago
Heuristic Search for Generalized Stochastic Shortest Path MDPs
Research in efficient methods for solving infinite-horizon MDPs has so far concentrated primarily on discounted MDPs and the more general stochastic shortest path problems (SSPs...
Andrey Kolobov, Mausam, Daniel S. Weld, Hector Gef...
MP
2002
13 years 4 months ago
A note on sensitivity of value functions of mathematical programs with complementarity constraints
Using standard nonlinear programming (NLP) theory, we establish formulas for first and second order directional derivatives for optimal value functions of parametric mathematical ...
Xinmin Hu, Daniel Ralph
JGO
2008
13 years 4 months ago
Smoothing by mollifiers. Part I: semi-infinite optimization
We show that a compact feasible set of a standard semi-infinite optimization problem can be approximated arbitrarily well by a level set of a single smooth function with certain r...
Hubertus Th. Jongen, Oliver Stein
AAAI
2006
13 years 6 months ago
Compact, Convex Upper Bound Iteration for Approximate POMDP Planning
Partially observable Markov decision processes (POMDPs) are an intuitive and general way to model sequential decision making problems under uncertainty. Unfortunately, even approx...
Tao Wang, Pascal Poupart, Michael H. Bowling, Dale...
WSC
2007
13 years 6 months ago
Path-wise estimators and cross-path regressions: an application to evaluating portfolio strategies
Recently developed dual techniques allow us to evaluate a given sub-optimal dynamic portfolio policy by using the policy to construct an upper bound on the optimal value function....
Martin B. Haugh, Ashish Jain
FLAIRS
2008
13 years 7 months ago
A Novel Prioritization Technique for Solving Markov Decision Processes
We address the problem of computing an optimal value function for Markov decision processes. Since finding this function quickly and accurately requires substantial computation ef...
Jilles Steeve Dibangoye, Brahim Chaib-draa, Abdel-...
PRICAI
2000
Springer
13 years 8 months ago
A POMDP Approximation Algorithm That Anticipates the Need to Observe
This paper introduces the even-odd POMDP, an approximation to POMDPs in which the world is assumed to be fully observable every other time step. The even-odd POMDP can be converte...
Valentina Bayer Zubek, Thomas G. Dietterich
ICML
2001
IEEE
14 years 5 months ago
Symmetry in Markov Decision Processes and its Implications for Single Agent and Multiagent Learning
This paper examines the notion of symmetry in Markov decision processes (MDPs). We define symmetry for an MDP and show how it can be exploited for more effective learning in singl...
Martin Zinkevich, Tucker R. Balch
ICML
2005
IEEE
14 years 5 months ago
Finite time bounds for sampling based fitted value iteration
In this paper we consider sampling based fitted value iteration for discounted, large (possibly infinite) state space, finite action Markovian Decision Problems where only a gener...
Csaba Szepesvári, Rémi Munos