Regret Minimization Under Partial Monitoring

13 years 4 months ago


We consider repeated games in which the player, instead of observing the action chosen by the opponent in each game round, receives feedback generated by the combined choice of the two players. We study Hannan consistent players for these games; that is, randomized playing strategies whose per-round regret vanishes with probability one as the number n of game rounds goes to infinity. We prove a general lower bound of Ω(n^{-1/3}) on the convergence rate of the regret, and exhibit a specific strategy that attains this rate on any game for which a Hannan consistent player exists.

The first two authors acknowledge support by the PASCAL Network of Excellence under EC grant no. 506778. The work of the second author was supported by the Spanish Ministry of Science and Technology and FEDER, grant BMF2003-03324. Part of this work was done while the third co-author was visiting Pompeu Fabra University.
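
For reference, the per-round regret and Hannan consistency mentioned in the abstract can be written out as below. This is a sketch using standard notation that is assumed here rather than taken from the abstract: ℓ(i, j) is the loss when the player chooses action i and the opponent chooses action j, N is the number of player actions, and I_t and J_t are the actions of the player and the opponent in round t.

```latex
% Per-round regret after n rounds (notation assumed, not from the abstract):
% the player's average loss minus the average loss of the best fixed action.
\[
  \frac{1}{n}\left( \sum_{t=1}^{n} \ell(I_t, J_t)
    \;-\; \min_{i=1,\dots,N} \sum_{t=1}^{n} \ell(i, J_t) \right)
\]

% Hannan consistency: the per-round regret vanishes with probability one.
\[
  \limsup_{n\to\infty} \frac{1}{n}\left( \sum_{t=1}^{n} \ell(I_t, J_t)
    \;-\; \min_{i=1,\dots,N} \sum_{t=1}^{n} \ell(i, J_t) \right) \le 0
  \quad \text{almost surely.}
\]
```

Under partial monitoring the player cannot evaluate these sums directly, since J_t is not observed and only the feedback signal is available; the convergence rate in the abstract describes how fast the first quantity can be driven to zero.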

Nicolò Cesa-Bianchi, Gábor Lugosi, G
