
NIPS
2004


Experts in a Markov Decision Process



We consider an MDP setting in which the reward function is allowed to change at each time step of play (possibly in an adversarial manner), while the dynamics remain fixed. As in the experts setting, we ask how well an agent can do compared to the reward achieved under the best stationary policy over time. We provide efficient algorithms whose regret bounds have no dependence on the size of the state space. Instead, these bounds depend only on a certain horizon time of the process and logarithmically on the number of actions. We also show that when the dynamics change over time, the problem becomes computationally hard.
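For readers unfamiliar with the experts setting that the abstract compares against, the following is a minimal sketch of a generic exponential-weights (Hedge) learner and of regret measured against the best fixed action in hindsight. It is not the paper's algorithm: it has no states or dynamics, and the names `hedge_update`, `run_hedge`, and the learning rate `eta` are hypothetical choices made for illustration. Rewards are assumed to lie in [0, 1].

```python
import math


def hedge_update(weights, rewards, eta):
    """One exponential-weights update over a finite set of actions.

    weights: current probability distribution over actions.
    rewards: reward of each action this round, assumed to lie in [0, 1].
    eta: learning rate (an illustrative parameter, not from the source).
    """
    scaled = [w * math.exp(eta * r) for w, r in zip(weights, rewards)]
    total = sum(scaled)
    return [s / total for s in scaled]


def run_hedge(reward_sequence, eta):
    """Play Hedge on a sequence of reward vectors.

    Returns (expected reward of the learner, regret against the best
    single action chosen in hindsight, final weights).
    """
    n_actions = len(reward_sequence[0])
    weights = [1.0 / n_actions] * n_actions
    expected_reward = 0.0
    for rewards in reward_sequence:
        # The learner commits to its distribution before seeing the rewards.
        expected_reward += sum(w * r for w, r in zip(weights, rewards))
        weights = hedge_update(weights, rewards, eta)
    best_fixed = max(
        sum(rewards[a] for rewards in reward_sequence)
        for a in range(n_actions)
    )
    return expected_reward, best_fixed - expected_reward, weights


if __name__ == "__main__":
    T = 50
    sequence = [[1.0, 0.0] for _ in range(T)]  # action 0 is always better
    eta = math.sqrt(8.0 * math.log(2) / T)
    reward, regret, weights = run_hedge(sequence, eta)
    print(f"expected reward={reward:.3f} regret={regret:.3f}")
```

In this plain experts setting, the standard Hedge analysis bounds the regret by ln(n)/eta + eta*T/8 for n actions and T rounds, which is where a logarithmic dependence on the number of actions comes from. The MDP setting in the abstract adds fixed dynamics and compares against stationary policies rather than single actions.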

Eyal Even-Dar, Sham M. Kakade, Yishay Mansour



Post Info

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2004
Where	NIPS
Authors	Eyal Even-Dar, Sham M. Kakade, Yishay Mansour
