We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...
Accurate knowledge of the effect of parameter uncertainty on process design and operation is essential for optimal and feasible operation of a process plant. Existing approaches de...
We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. Optimism is usually implemented by carryin...
The analysis of online least squares estimation is at the heart of many stochastic sequential decision-making problems. We employ tools from the self-normalized processes to provi...
This paper presents properties and results of a new framework for sequential decision-making in multiagent settings called interactive partially observable Markov decision process...