Search Sciweavers | Sciweavers

166 search results - page 2 / 34

» Online model learning in adversarial Markov decision process...

EWRL
2008

129 views · Machine Learning

Markov Decision Processes with Arbitrary Reward Processes

13 years 6 months ago

Download www.cim.mcgill.ca

Abstract. We consider a control problem where the decision maker interacts with a standard Markov decision process with the exception that the reward functions vary arbitrarily ove...

Jia Yuan Yu, Shie Mannor, Nahum Shimkin

NIPS
2007

149 views · Information Technology

Online Linear Regression and Its Application to Model-Based Reinforcement Learning

13 years 5 months ago

Download books.nips.cc

We provide a provably efficient algorithm for learning Markov Decision Processes (MDPs) with continuous state and action spaces in the online setting. Specifically, we take a mo...

Alexander L. Strehl, Michael L. Littman

ECML
2005
Springer

143 views · Machine Learning

Active Learning in Partially Observable Markov Decision Processes

13 years 10 months ago

Download www.cs.mcgill.ca

This paper examines the problem of finding an optimal policy for a Partially Observable Markov Decision Process (POMDP) when the model is not known or is only poorly specified. W...

Robin Jaulmes, Joelle Pineau, Doina Precup

AAAI
2011

246 views · Intelligent Agents

An Online Spectral Learning Algorithm for Partially Observable Nonlinear Dynamical Systems

12 years 4 months ago

Download www.cs.cmu.edu

Recently, a number of researchers have proposed spectral algorithms for learning models of dynamical systems—for example, Hidden Markov Models (HMMs), Partially Observable Marko...

Byron Boots, Geoffrey J. Gordon

CDC
2009
IEEE

169 views · Control Systems

Parametric regret in uncertain Markov decision processes

13 years 9 months ago

Download www.cim.mcgill.ca

We consider decision making in a Markovian setup where the reward parameters are not known in advance. Our performance criterion is the gap between the performance of the best ...

Huan Xu, Shie Mannor
