This paper considers online stochastic optimization problems where time constraints severely limit the number of offline optimizations which can be performed at decision time and/...
The Maximum Differential Backlog (MDB) control policy of Tassiulas and Ephremides has been shown to adaptively maximize the stable throughput of multihop wireless networks with ran...
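As a rough illustration of the differential-backlog idea named in the abstract above, the following minimal sketch computes, for each link, the commodity with the largest positive backlog difference and then picks the feasible link set with the largest total weight. The function names (mdb_link_weights, mdb_schedule), the dictionary layout of backlog and rates, and the explicit list feasible_sets are hypothetical choices made for this example; they are not taken from the paper, which may treat rates, channels, and interference differently.

```python
def mdb_link_weights(backlog, links):
    """For each link (i, j), find the commodity with the largest
    positive differential backlog backlog[i][c] - backlog[j][c].

    Returns {link: (commodity or None, weight)}; the weight is 0.0 and
    the commodity is None when no commodity has a positive difference.
    """
    weights = {}
    for (i, j) in links:
        best_commodity, best_diff = None, 0.0
        for commodity, queue_len in backlog[i].items():
            diff = queue_len - backlog[j].get(commodity, 0.0)
            if diff > best_diff:
                best_commodity, best_diff = commodity, diff
        weights[(i, j)] = (best_commodity, best_diff)
    return weights


def mdb_schedule(backlog, links, rates, feasible_sets):
    """Pick the feasible set of links maximizing sum(weight * rate).

    feasible_sets is an explicit list of link sets that may be active
    together (an assumption of this sketch; enumerating them is only
    practical for small examples).
    Returns {link: commodity} for the links that should transmit.
    """
    weights = mdb_link_weights(backlog, links)
    best_set = max(
        feasible_sets,
        key=lambda s: sum(weights[link][1] * rates[link] for link in s),
    )
    return {
        link: weights[link][0]
        for link in best_set
        if weights[link][0] is not None
    }


# Toy three-node line network a -> b -> c with two commodities.
backlog = {
    "a": {"x": 5, "y": 1},
    "b": {"x": 2, "y": 4},
    "c": {"x": 0, "y": 0},
}
links = [("a", "b"), ("b", "c")]
rates = {("a", "b"): 1.0, ("b", "c"): 1.0}
# Assume the two links interfere, so only one may be active per slot.
feasible_sets = [[("a", "b")], [("b", "c")]]
decision = mdb_schedule(backlog, links, rates, feasible_sets)
```

In this toy case link (a, b) has weight 3 for commodity x and link (b, c) has weight 4 for commodity y, so the sketch activates (b, c) and serves commodity y.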
A simulation-based optimization framework involving simultaneous perturbation stochastic approximation (SPSA) is presented as a means for optimally specifying parameters of intern...
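For readers unfamiliar with SPSA, the following minimal sketch shows the basic two-measurement update on a toy quadratic loss standing in for a simulation. The function name spsa_minimize, the gain constants (a, c, A), and the iteration count are assumptions chosen for this example, not values from the paper; the decay exponents 0.602 and 0.101 are commonly used defaults in the SPSA literature.

```python
import random


def spsa_minimize(loss, theta0, iterations=1000, a=0.5, c=0.1, A=10.0,
                  alpha=0.602, gamma=0.101, seed=0):
    """Minimize loss(theta) using simultaneous perturbation estimates.

    Each iteration evaluates the loss only twice, regardless of the
    dimension of theta, by perturbing all coordinates at once.
    """
    rng = random.Random(seed)
    theta = list(theta0)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha   # decaying step size
        c_k = c / (k + 1) ** gamma       # decaying perturbation size
        # Random +/-1 perturbation applied to every coordinate.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Gradient estimate: the same difference divided per coordinate.
        grad = [diff / (2.0 * c_k * d) for d in delta]
        theta = [t - a_k * g for t, g in zip(theta, grad)]
    return theta


def toy_loss(theta):
    # Stand-in for a simulation output; minimized at (1, -2).
    return (theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2


estimate = spsa_minimize(toy_loss, [0.0, 0.0])
```

In a simulation-based setting, toy_loss would be replaced by a (typically noisy) simulation run, and the gain constants would need tuning to the scale of the problem.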
It is known that the complexity of reinforcement learning algorithms, such as Q-learning, may be exponential in the number of the environment’s states. It was shown, however, th...
We propose a convex-concave programming approach for the labeled weighted graph matching problem. The convex-concave programming formulation is obtained by rewriting the weighted ...
Mikhail Zaslavskiy, Francis Bach, Jean-Philippe Ve...