—In this paper we present APOS, a method for dynamically adapting the parameters of IEEE 802.11g to the estimated system state, with the aim of enhancing the quality of a voice c...
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
— Many decision systems rely on a precisely known Markov Chain model to guarantee optimal performance, and this paper considers the online estimation of unknown, nonstationary Ma...
Abstract—It is well known that for finite-sized networks, onestep retrieval in the autoassociative Willshaw net is a suboptimal way to extract the information stored in the syna...
Given any generative classifier based on an inexact density model, we can define a discriminative counterpart that reduces its asymptotic error rate, while increasing the estima...