In this letter, we develop an expression for the approximate throughput guarantee violation probability (TGVP) for users in time-slotted networks for any scheduling algorithm with ...
In this paper, we develop models for and analyze several randomized work stealing algorithms in a dynamic setting. Our models represent the limiting behavior of systems as the numb...
We discuss the problem of finding a good state representation in stochastic systems with observations. We develop a duality theory that generalizes existing work in predictive sta...
Christopher Hundt, Prakash Panangaden, Joelle Pine...
A class of trust-region algorithms is developed and analyzed for the solution of minimization problems with nonlinear inequality constraints. Based on composite-step trust-region ...
Reinforcement learning problems are commonly tackled with temporal difference methods, which use dynamic programming and statistical sampling to estimate the long-term value of ta...