Many statistical M-estimators are based on convex optimization problems formed by the weighted sum of a loss function with a norm-based regularizer. We analyze the convergence rat...
Alekh Agarwal, Sahand Negahban, Martin J. Wainwrig...
We present a fully four-dimensional, globally convergent, incremental gradient algorithm to estimate the continuous-time tracer density from list mode positron emission tomography...
Abstract--An Elman network (EN) can be viewed as a feedforward (FF) neural network with an additional set of inputs from the context layer (feedback from the hidden layer). Therefo...
We model reinforcement learning as the problem of learning to control a Partially Observable Markov Decision Process ( ¢¡¤£¦¥§ ), and focus on gradient ascent approache...
Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...