Sciweavers

NIPS 2004
Experts in a Markov Decision Process
We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Simi...
Eyal Even-Dar, Sham M. Kakade, Yishay Mansour