In this paper, we propose a policy gradient reinforcement learning algorithm to address transition-independent Dec-POMDPs. This approach aims at implicitly exploiting the locality...
Abstract. We study the decision theory of a maximally risk-averse investor — one whose objective, in the face of stochastic uncertainties, is to minimize the probability of ever ...
Noam Berger, Nevin Kapur, Leonard J. Schulman, Vij...
Abstract. We study two boosting algorithms, Coordinate Ascent Boosting and Approximate Coordinate Ascent Boosting, which are explicitly designed to produce maximum margins. To deri...
Cynthia Rudin, Robert E. Schapire, Ingrid Daubechi...
Standard no-internal-regret (NIR) algorithms compute a fixed point of a matrix, and hence typically require O(n^3) run time per round of learning, where n is the dimensionality of...
This work first presents a general technique to compute tight upper and lower bounds on the information rate of a multiuser Rayleigh fading channel with no Channel State Inform...
Krishnan Padmanabhan, Sundeep Venkatraman, Oliver ...