Sciweavers

40 search results - page 3 / 8
» Parametric regret in uncertain Markov decision processes
NIPS
2007
13 years 7 months ago
Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs
We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...
Ambuj Tewari, Peter L. Bartlett
ANOR
2004
13 years 5 months ago
Model Independent Parametric Decision Making
Accurate knowledge of the effect of parameter uncertainty on process design and operation is essential for optimal and feasible operation of a process plant. Existing approaches de...
Ipsita Banerjee, Marianthi G. Ierapetritou
CORR
2010
Springer
13 years 4 months ago
Optimism in Reinforcement Learning Based on Kullback-Leibler Divergence
We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. Optimism is usually implemented by carryin...
Sarah Filippi, Olivier Cappé, Aurelien Gari...
CORR
2011
Springer
13 years 17 days ago
Online Least Squares Estimation with Self-Normalized Processes: An Application to Bandit Problems
The analysis of online least squares estimation is at the heart of many stochastic sequential decision-making problems. We employ tools from the self-normalized processes to provi...
Yasin Abbasi-Yadkori, Dávid Pál, Csa...
ATAL
2004
Springer
13 years 11 months ago
Interactive POMDPs: Properties and Preliminary Results
This paper presents properties and results of a new framework for sequential decision-making in multiagent settings called interactive partially observable Markov decision process...
Piotr J. Gmytrasiewicz, Prashant Doshi