Sciweavers

178 search results - page 14 / 36
» Efficient Approximation of Optimal Control for Markov Games
AIPS
2009
14 years 10 months ago
Efficient Solutions to Factored MDPs with Imprecise Transition Probabilities
When modeling real-world decision-theoretic planning problems in the Markov decision process (MDP) framework, it is often impossible to obtain a completely accurate estimate of tr...
Karina Valdivia Delgado, Scott Sanner, Leliane Nun...
IJCAI
2001
14 years 10 months ago
R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning
R-max is a very simple model-based reinforcement learning algorithm which can attain near-optimal average reward in polynomial time. In R-max, the agent always maintains a complet...
Ronen I. Brafman, Moshe Tennenholtz
AIPS
2006
14 years 11 months ago
Solving Factored MDPs with Exponential-Family Transition Models
Markov decision processes (MDPs) with discrete and continuous state and action components can be solved efficiently by hybrid approximate linear programming (HALP). The main idea ...
Branislav Kveton, Milos Hauskrecht
CDC
2009
IEEE
15 years 2 months ago
Q-learning and Pontryagin's Minimum Principle
Q-learning is a technique used to compute an optimal policy for a controlled Markov chain based on observations of the system controlled using a non-optimal policy. It ...
Prashant G. Mehta, Sean P. Meyn
CDC
2010
IEEE
14 years 4 months ago
Numerical methods for the optimization of nonlinear stochastic delay systems, and an application to internet regulation
The Markov chain approximation method is an effective and widely used approach for computing optimal values and controls for stochastic systems. It was extended to nonlinear (and p...
Harold J. Kushner