Learning in mean-field oscillator games


This research concerns a noncooperative dynamic game with a large number of oscillators. The states are interpreted as the phase angles for a collection of non-homogeneous oscillators, and in this way the model may be regarded as an extension of the classical coupled oscillator model of Kuramoto. We introduce approximate dynamic programming (ADP) techniques for learning approximately optimal control laws for this model. Two types of parameterizations are considered, each of which is based on analysis of the deterministic PDE model introduced in our prior research. In an offline setting, a Galerkin procedure is introduced to choose the optimal parameters. In an online setting, a steepest descent stochastic approximation algorithm is proposed. We provide a detailed analysis of the optimal parameter values as well as the Bellman error for both the Galerkin approximation and the online algorithm. Finally, a phase transition result is described for the large population limit when each oscillat...
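For readers unfamiliar with the classical Kuramoto model that the abstract refers to, the following is a minimal sketch of that baseline model only, not of the game, the ADP parameterizations, or the learning algorithms of the paper. It integrates dθ_i/dt = ω_i + K r sin(ψ − θ_i) with forward Euler, where r e^{iψ} is the average of e^{iθ_j} over the population. The function names, the Gaussian frequency distribution, and all numerical values (population size, coupling, step size) are illustrative assumptions and do not come from the paper.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Return (r, psi): magnitude and angle of the population mean of exp(i*theta)."""
    z = sum(cmath.exp(1j * th) for th in phases) / len(phases)
    return abs(z), cmath.phase(z)


def simulate_kuramoto(n=200, coupling=4.0, freq_std=0.5, dt=0.05, steps=400, seed=0):
    """Forward-Euler simulation of the classical Kuramoto model.

    All parameter values are hypothetical choices for illustration.
    Returns the order parameter magnitude r at the final time.
    """
    rng = random.Random(seed)
    # Non-homogeneous oscillators: each has its own natural frequency.
    omegas = [rng.gauss(0.0, freq_std) for _ in range(n)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    for _ in range(steps):
        r, psi = order_parameter(phases)
        # Mean-field form of (K/N) * sum_j sin(theta_j - theta_i).
        phases = [
            (th + dt * (w + coupling * r * math.sin(psi - th))) % (2.0 * math.pi)
            for th, w in zip(phases, omegas)
        ]
    return order_parameter(phases)[0]


if __name__ == "__main__":
    print("r with strong coupling:", simulate_kuramoto(coupling=4.0))
    print("r with no coupling:   ", simulate_kuramoto(coupling=0.0))
```

In this sketch a value of r near 1 indicates synchronized phases and a value near 0 indicates incoherence, which is the qualitative distinction behind phase transition results in models of this type.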

Huibing Yin, Prashant G. Mehta, Sean P. Meyn, Uday


CDC 2010 | Control Systems | Optimal Control Law | Optimal Parameter | Oscillators |


» Reinforcement learning in extensive form games with incomplete information the bargaining ...

» Pathologies of temporal difference methods in approximate dynamic programming

Post Info

Added	13 May 2011
Updated	13 May 2011
Type	Conference
Year	2010
Where	CDC
Authors	Huibing Yin, Prashant G. Mehta, Sean P. Meyn, Uday V. Shanbhag
