Sciweavers

15 search results - page 3 / 3
» Incremental plan aggregation for generating policies in MDPs
Sort
View
CORR
2012
Springer
235views Education» more  CORR 2012»
12 years 22 days ago
An Incremental Sampling-based Algorithm for Stochastic Optimal Control
Abstract— In this paper, we consider a class of continuoustime, continuous-space stochastic optimal control problems. Building upon recent advances in Markov chain approximation ...
Vu Anh Huynh, Sertac Karaman, Emilio Frazzoli
NIPS
1998
13 years 6 months ago
Gradient Descent for General Reinforcement Learning
A simple learning rule is derived, the VAPS algorithm, which can be instantiated to generate a wide range of new reinforcementlearning algorithms. These algorithms solve a number ...
Leemon C. Baird III, Andrew W. Moore
ATAL
2009
Springer
13 years 11 months ago
Constraint-based dynamic programming for decentralized POMDPs with structured interactions
Decentralized partially observable MDPs (DEC-POMDPs) provide a rich framework for modeling decision making by a team of agents. Despite rapid progress in this area, the limited sc...
Akshat Kumar, Shlomo Zilberstein
ATAL
2009
Springer
13 years 11 months ago
Point-based incremental pruning heuristic for solving finite-horizon DEC-POMDPs
Recent scaling up of decentralized partially observable Markov decision process (DEC-POMDP) solvers towards realistic applications is mainly due to approximate methods. Of this fa...
Jilles Steeve Dibangoye, Abdel-Illah Mouaddib, Bra...
UAI
2008
13 years 6 months ago
Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping
We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...
Richard S. Sutton, Csaba Szepesvári, Alborz...