Sciweavers

2005 search results - page 196 / 401
» Decisive Markov Chains
Sort
View
ICALP
2009
Springer
16 years 2 months ago
Reachability in Stochastic Timed Games
We define stochastic timed games, which extend two-player timed games with probabilities (following a recent approach by Baier et al), and which extend in a natural way continuous-...
Patricia Bouyer, Vojtech Forejt
97
Voted
ICRA
2007
IEEE
126views Robotics» more  ICRA 2007»
15 years 8 months ago
A formal framework for robot learning and control under model uncertainty
— While the Partially Observable Markov Decision Process (POMDP) provides a formal framework for the problem of robot control under uncertainty, it typically assumes a known and ...
Robin Jaulmes, Joelle Pineau, Doina Precup
ECML
2007
Springer
15 years 8 months ago
Safe Q-Learning on Complete History Spaces
In this article, we present an idea for solving deterministic partially observable markov decision processes (POMDPs) based on a history space containing sequences of past observat...
Stephan Timmer, Martin Riedmiller
ICANN
2007
Springer
15 years 8 months ago
Solving Deep Memory POMDPs with Recurrent Policy Gradients
Abstract. This paper presents Recurrent Policy Gradients, a modelfree reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov...
Daan Wierstra, Alexander Förster, Jan Peters,...
148
Voted
ICML
2006
IEEE
15 years 7 months ago
Automatic basis function construction for approximate dynamic programming and reinforcement learning
We address the problem of automatically constructing basis functions for linear approximation of the value function of a Markov Decision Process (MDP). Our work builds on results ...
Philipp W. Keller, Shie Mannor, Doina Precup