Sciweavers

162 search results - page 4 / 33
AI
2008
Springer
13 years 6 months ago
Reachability analysis of uncertain systems using bounded-parameter Markov decision processes
Verification of reachability properties for probabilistic systems is usually based on variants of Markov processes. Current methods assume an exact model of the dynamic behavior a...
Di Wu, Xenofon D. Koutsoukos
ICML
2010
IEEE
13 years 7 months ago
Convergence of Least Squares Temporal Difference Methods Under General Conditions
We consider approximate policy evaluation for finite state and action Markov decision processes (MDP) in the off-policy learning context and with the simulation-based least square...
Huizhen Yu
CORR
2010
Springer
13 years 4 months ago
Optimism in Reinforcement Learning Based on Kullback-Leibler Divergence
We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. Optimism is usually implemented by carryin...
Sarah Filippi, Olivier Cappé, Aurelien Gari...
ICML
2006
IEEE
14 years 7 months ago
Fast direct policy evaluation using multiscale analysis of Markov diffusion processes
Policy evaluation is a critical step in the approximate solution of large Markov decision processes (MDPs), typically requiring O(|S|^3) to directly solve the Bellman system of |S...
Mauro Maggioni, Sridhar Mahadevan
IJCAI
2007
13 years 7 months ago
First Order Decision Diagrams for Relational MDPs
Dynamic programming algorithms provide a basic tool for identifying optimal solutions in Markov Decision Processes (MDP). The paper develops a representation for decision diagrams sui...
Chenggang Wang, Saket Joshi, Roni Khardon