We consider a variant of the classic multi-armed bandit (MAB) problem, which we call FEEDBACK MAB, where the reward obtained by playing each of n independent arms varies according...
We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...
The recovery of signal parameters from noisy sampled data is a fundamental problem in digital signal processing. In this paper, we consider the following spectral analysis problem...
Given a certain function f, various methods have been proposed in the past for addressing the important problem of computing the matrix-vector product f(A)b without explicitly comp...
In the Stochastic Orienteering problem, we are given a metric in which each node hosts a job with a deterministic reward and a random size. (Think of the jobs as...