We study rerouting policies in a dynamic round-based variant of a well-known game-theoretic traffic model due to Wardrop. Previous analyses (mostly in the context of selfish routi...
Stochastic games generalize Markov decision processes (MDPs) to a multiagent setting by allowing the state transitions to depend jointly on all player actions, and having rewards de...
Michael J. Kearns, Yishay Mansour, Satinder P. Sin...
We investigate the behaviour of load-adaptive rerouting policies in the Wardrop model where decisions must be made on the basis of stale information. In this model, an infinite n...
Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease of implementation and use of "bootstrapped" return estimates to make effi...
Recently, an optimization approach for fast visual tracking of articulated structures based on Stochastic Meta-Descent (SMD) [7] has been presented. SMD is a gradient descent with...
Matthieu Bray, Esther Koller-Meier, Nicol N. Schra...