
ATAL
2015
Springer

Two-Timescale Algorithms for Learning Nash Equilibria in General-Sum Stochastic Games

We consider the problem of finding stationary Nash equilibria (NE) in a finite discounted general-sum stochastic game. We first generalize a non-linear optimization problem from [9] to a general N-player game setting. Next, we break down the optimization problem into simpler sub-problems that ensure there is no Bellman error for a given state and agent. We then characterize the solution points of these sub-problems that correspond to Nash equilibria of the underlying game; for this purpose, we derive a set of necessary and sufficient SG-SP (Stochastic Game Sub-Problem) conditions. Using these conditions, we develop two provably convergent algorithms. The first algorithm, OFF-SGSP, is centralized and model-based, i.e., it assumes complete information of the game. The second algorithm, ON-SGSP, is an online model-free algorithm. We establish that both algorithms converge, in self-play, to the equilibria of a certain ordinary differential equation (ODE), whose st...
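The idea of a Bellman error for a single state and a single agent can be illustrated with a small sketch. This is not the paper's OFF-SGSP or ON-SGSP algorithm, and it does not implement the SG-SP conditions. It is a minimal two-agent illustration under assumed names and data layouts (`pi[i][s][a]`, `r[i][s][a0][a1]`, `p[s][a0][a1][s2]`, `beta`, and all function names are hypothetical). The game is a one-state prisoner's dilemma repeated with discount factor 0.5. The sketch computes the Bellman error of a value estimate under a fixed stationary policy and, separately, checks the textbook requirement that no one-step unilateral deviation beats that value.

```python
# Illustrative sketch only: all names and the data layout are assumptions,
# not taken from the paper. Two agents, one state, two actions (0 = C, 1 = D).

def q_values(i, s, pi, r, p, v, beta):
    """Return agent i's value for each own action in state s,
    with the other agent following its stationary policy pi."""
    other = 1 - i
    q = []
    for a_i in range(len(pi[i][s])):
        total = 0.0
        for a_o, prob_o in enumerate(pi[other][s]):
            # Order the joint action as (agent 0's action, agent 1's action).
            a0, a1 = (a_i, a_o) if i == 0 else (a_o, a_i)
            future = sum(p[s][a0][a1][s2] * v[i][s2] for s2 in range(len(v[i])))
            total += prob_o * (r[i][s][a0][a1] + beta * future)
        q.append(total)
    return q

def bellman_error(i, s, pi, r, p, v, beta):
    """v_i(s) minus the expected one-step backup under agent i's own policy."""
    q = q_values(i, s, pi, r, p, v, beta)
    return v[i][s] - sum(pa * qa for pa, qa in zip(pi[i][s], q))

def no_profitable_deviation(i, s, pi, r, p, v, beta, tol=1e-9):
    """True if no single action of agent i in state s yields more than v_i(s)."""
    return all(qa <= v[i][s] + tol for qa in q_values(i, s, pi, r, p, v, beta))

beta = 0.5
# r[i][s][a0][a1]: prisoner's dilemma payoffs for agent i.
r = [[[[3.0, 0.0], [5.0, 1.0]]],
     [[[3.0, 5.0], [0.0, 1.0]]]]
# p[s][a0][a1][s2]: the single state always transitions to itself.
p = [[[[1.0], [1.0]], [[1.0], [1.0]]]]

# Both agents always defect; value 1 / (1 - beta) = 2 for each agent.
pi_defect = [[[0.0, 1.0]], [[0.0, 1.0]]]
v_defect = [[2.0], [2.0]]

# Both agents always cooperate; value 3 / (1 - beta) = 6 for each agent.
pi_coop = [[[1.0, 0.0]], [[1.0, 0.0]]]
v_coop = [[6.0], [6.0]]
```

In this toy game both policy-value pairs have zero Bellman error for every agent, yet only mutual defection passes the deviation check. This illustrates why zero Bellman error alone does not single out Nash equilibria and why additional conditions are needed.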
H. L. Prasad, Prashanth L. A., Shalabh Bhatnagar
Added 16 Apr 2016
Updated 16 Apr 2016
Type Journal
Year 2015
Where ATAL
Authors H. L. Prasad, Prashanth L. A., Shalabh Bhatnagar