COLT
2004
Springer

Reinforcement Learning for Average Reward Zero-Sum Games

Abstract. We consider Reinforcement Learning for average reward zero-sum stochastic games. We present and analyze two algorithms. The first is based on relative Q-learning and the second on Q-learning for stochastic shortest path games. Convergence is proved using the ODE (Ordinary Differential Equation) method. We further discuss the case where not all the actions are played by the opponent with comparable frequencies and present an algorithm that converges to the optimal Q-function, given the observed play of the opponent.
Shie Mannor
Added 01 Jul 2010
Updated 01 Jul 2010
Type Conference
Year 2004
Where COLT
Authors Shie Mannor