Sciweavers

166 search results - page 2 / 34
» Online model learning in adversarial Markov decision process...
EWRL
2008
13 years 6 months ago
Markov Decision Processes with Arbitrary Reward Processes
Abstract. We consider a control problem where the decision maker interacts with a standard Markov decision process with the exception that the reward functions vary arbitrarily ove...
Jia Yuan Yu, Shie Mannor, Nahum Shimkin
NIPS
2007
13 years 5 months ago
Online Linear Regression and Its Application to Model-Based Reinforcement Learning
We provide a provably efficient algorithm for learning Markov Decision Processes (MDPs) with continuous state and action spaces in the online setting. Specifically, we take a mo...
Alexander L. Strehl, Michael L. Littman
ECML
2005
Springer
13 years 10 months ago
Active Learning in Partially Observable Markov Decision Processes
This paper examines the problem of finding an optimal policy for a Partially Observable Markov Decision Process (POMDP) when the model is not known or is only poorly specified. W...
Robin Jaulmes, Joelle Pineau, Doina Precup
AAAI
2011
12 years 4 months ago
An Online Spectral Learning Algorithm for Partially Observable Nonlinear Dynamical Systems
Recently, a number of researchers have proposed spectral algorithms for learning models of dynamical systems—for example, Hidden Markov Models (HMMs), Partially Observable Marko...
Byron Boots, Geoffrey J. Gordon
CDC
2009
IEEE
13 years 9 months ago
Parametric regret in uncertain Markov decision processes
We consider decision making in a Markovian setup where the reward parameters are not known in advance. Our performance criterion is the gap between the performance of the best ...
Huan Xu, Shie Mannor