Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains. However, solving a POMDP i...
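For reference, the belief update at the heart of the POMDP framework (standard notation; the symbols b, T, O below are generic and not taken from this abstract) is:

```latex
b_{a,o}(s') \;=\;
\frac{O(o \mid s', a)\,\sum_{s} T(s' \mid s, a)\, b(s)}
     {\sum_{s''} O(o \mid s'', a)\,\sum_{s} T(s'' \mid s, a)\, b(s)}
```

Here b is the current belief over states, T the transition model, and O the observation model; the denominator normalizes the updated belief.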
For a Markov Decision Process with a finite state space (size S) and finite action spaces (size A per state), we propose a new algorithm, Delayed Q-Learning. We prove it is PAC, achieving near o...
Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...
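To make the algorithm named in the abstract concrete, here is a simplified Python sketch of a delayed, optimistically initialized Q-update. The m-sample batching and the epsilon1 threshold follow the abstract's description of Delayed Q-Learning, but the class name, parameter defaults, and the omission of the LEARN-flag bookkeeping are simplifications for illustration, not the authors' implementation.

```python
import numpy as np

class DelayedQLearner:
    """Simplified sketch of a delayed Q-learning update.

    Q-values start optimistically at 1/(1-gamma). For each (s, a) the agent
    accumulates m sampled Bellman targets and only updates Q(s, a) when the
    batch would lower it by at least epsilon1. The LEARN-flag machinery used
    by the full algorithm to bound attempted updates is omitted here.
    """

    def __init__(self, n_states, n_actions, gamma=0.95, m=20, epsilon1=0.01):
        self.gamma, self.m, self.eps1 = gamma, m, epsilon1
        self.Q = np.full((n_states, n_actions), 1.0 / (1.0 - gamma))  # optimistic init
        self.U = np.zeros((n_states, n_actions))              # accumulated targets
        self.l = np.zeros((n_states, n_actions), dtype=int)   # samples gathered

    def act(self, s):
        # Greedy action selection; optimism in Q drives exploration.
        return int(np.argmax(self.Q[s]))

    def observe(self, s, a, r, s_next):
        # Accumulate one sample of the Bellman target for (s, a).
        self.U[s, a] += r + self.gamma * self.Q[s_next].max()
        self.l[s, a] += 1
        if self.l[s, a] == self.m:
            target = self.U[s, a] / self.m
            # Attempted update: only move Q down, and only by a meaningful margin.
            if self.Q[s, a] - target >= 2 * self.eps1:
                self.Q[s, a] = target + self.eps1
            self.U[s, a] = 0.0
            self.l[s, a] = 0
```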
We propose a framework for policy generation in continuous-time stochastic domains with concurrent actions and events of uncertain duration. We make no assumptions regarding the co...
Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the bel...
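As an illustration of belief tracking with a particle filter in such a continuous-state setting, a minimal bootstrap-filter update might look as follows. The functions transition_sample and observation_likelihood are hypothetical caller-supplied models, not interfaces from the paper.

```python
import numpy as np

def particle_filter_update(particles, action, observation,
                           transition_sample, observation_likelihood, rng=None):
    """One bootstrap-filter belief update over a set of state particles.

    transition_sample(state, action, rng)          -> sampled next state
    observation_likelihood(obs, state, action)     -> p(obs | state, action)
    Both are problem-specific models supplied by the caller.
    """
    rng = rng or np.random.default_rng()
    # Propagate every particle through the stochastic transition model.
    propagated = np.array([transition_sample(p, action, rng) for p in particles])
    # Weight particles by how well they explain the new observation.
    weights = np.array([observation_likelihood(observation, p, action) for p in propagated])
    total = weights.sum()
    weights = weights / total if total > 0 else np.full(len(propagated), 1.0 / len(propagated))
    # Resample to obtain an unweighted particle approximation of the new belief.
    idx = rng.choice(len(propagated), size=len(propagated), p=weights)
    return propagated[idx]
```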
We adopt the decision-theoretic principle of expected utility maximization as a paradigm for designing autonomous rational agents operating in multi-agent environments. We use the...
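A minimal rendering of the expected-utility-maximization principle, with prob and utility as illustrative placeholders rather than the paper's formalism:

```python
def best_action(actions, outcomes, prob, utility):
    """Pick the action with the highest expected utility.

    prob(outcome, action) -> probability of the outcome given the action
    utility(outcome)      -> the agent's utility for that outcome
    """
    def expected_utility(a):
        return sum(prob(o, a) * utility(o) for o in outcomes)
    return max(actions, key=expected_utility)
```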