Two-Timescale Algorithms for Learning Nash Equilibria in General-Sum Stochastic Games

Download www.aamas2015.com

We consider the problem of finding stationary Nash equilibria (NE) in a finite discounted general-sum stochastic game. We first generalize a non-linear optimization problem from [9] to a general N-player game setting. Next, we break down the optimization problem into simpler sub-problems that ensure there is no Bellman error for a given state and an agent. We then characterize the solution points of these sub-problems that correspond to Nash equilibria of the underlying game, and for this purpose we derive a set of necessary and sufficient SG-SP (Stochastic Game Sub-Problem) conditions. Using these conditions, we develop two provably convergent algorithms. The first algorithm, OFF-SGSP, is centralized and model-based, i.e., it assumes complete information of the game. The second algorithm, ON-SGSP, is an online model-free algorithm. We establish that both algorithms converge, in self-play, to the equilibria of a certain ordinary differential equation (ODE), whose st...
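To make the notion of a Bellman error "for a given state and an agent" concrete, here is a minimal Python sketch. It is not the paper's SG-SP conditions or either of its algorithms; it only evaluates the standard policy-evaluation residual for one agent in one state under a fixed stationary joint policy. The function name `bellman_error`, the nested-list/dict data layout, and the one-state example game are all assumptions made for illustration.

```python
from itertools import product


def bellman_error(state, agent, values, policy, reward, transition, beta):
    """Residual v_i(s) - E_pi[ r_i(s, a) + beta * sum_s' p(s'|s, a) * v_i(s') ].

    Assumed (hypothetical) layout:
      values[i][s]        value estimate of agent i in state s
      policy[j][s][a]     probability that agent j plays action a in state s
      reward[i][s][joint] reward to agent i for the joint action tuple
      transition[s][joint] list of next-state probabilities
    """
    n_agents = len(policy)
    action_sets = [range(len(policy[j][state])) for j in range(n_agents)]
    expected = 0.0
    for joint in product(*action_sets):
        # Probability of this joint action under independent mixed strategies.
        prob = 1.0
        for j, a in enumerate(joint):
            prob *= policy[j][state][a]
        if prob == 0.0:
            continue
        next_value = sum(
            p * values[agent][s2]
            for s2, p in enumerate(transition[state][joint])
        )
        expected += prob * (reward[agent][state][joint] + beta * next_value)
    return values[agent][state] - expected


# Illustrative one-state, two-agent, two-action game (made-up payoffs).
payoff0 = {(0, 0): 3.0, (0, 1): 0.0, (1, 0): 5.0, (1, 1): 1.0}
payoff1 = {(0, 0): 3.0, (0, 1): 5.0, (1, 0): 0.0, (1, 1): 1.0}
reward = [[payoff0], [payoff1]]
transition = [{joint: [1.0] for joint in payoff0}]  # always stay in state 0
policy = [[[0.5, 0.5]], [[0.5, 0.5]]]               # both agents mix uniformly
beta = 0.5

# Under uniform play the expected stage reward is 2.25, so v = 2.25 / (1 - beta).
consistent_values = [[4.5], [4.5]]
zero_values = [[0.0], [0.0]]

err_consistent = bellman_error(0, 0, consistent_values, policy, reward, transition, beta)
err_zero = bellman_error(0, 0, zero_values, policy, reward, transition, beta)
```

A zero residual here only says that the value estimate is consistent with the fixed joint policy; it does not by itself certify a Nash equilibrium, which additionally requires that no agent can gain by deviating unilaterally.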

H. L. Prasad, Prashanth L. A., Shalabh Bhatnagar

ATAL 2015 | Intelligent Agents |

» Multiagent reinforcement learning algorithm converging to Nash equilibrium in general-sum d...

» Multiagent Reinforcement Learning Theoretical Framework and an Algorithm

» Best-Response Multiagent Learning in Non-Stationary Environments

» The Dynamics of Multi-Agent Reinforcement Learning

Post Info

Added	16 Apr 2016
Updated	16 Apr 2016
Type	Journal
Year	2015
Where	ATAL
Authors	H. L. Prasad, Prashanth L. A., Shalabh Bhatnagar
