Sciweavers

52 search results - page 4 / 11
» Error Bounds for Approximate Policy Iteration
TIT
2010
14 years 4 months ago
On resource allocation in fading multiple-access channels - an efficient approximate projection approach
We consider the problem of rate and power allocation in a multiple-access channel. Our objective is to obtain rate and power allocation policies that maximize a general concave ut...
Ali ParandehGheibi, Atilla Eryilmaz, Asuman E. Ozd...
ICML
2007
IEEE
15 years 10 months ago
Multi-armed bandit problems with dependent arms
We provide a framework to exploit dependencies among arms in multi-armed bandit problems, when the dependencies are in the form of a generative model on clusters of arms. We find ...
Sandeep Pandey, Deepayan Chakrabarti, Deepak Agarw...
JMLR
2006
14 years 9 months ago
Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation
We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov Chains. The method is based on designing sequential control variates using s...
Rémi Munos
SIAMJO
2008
14 years 9 months ago
Smooth Optimization with Approximate Gradient
We show that the optimal complexity of Nesterov's smooth first-order optimization algorithm is preserved when the gradient is only computed up to a small, uniformly bounded er...
Alexandre d'Aspremont
AIPS
2010
14 years 12 months ago
When Policies Can Be Trusted: Analyzing a Criteria to Identify Optimal Policies in MDPs with Unknown Model Parameters
Computing a good policy in stochastic uncertain environments with unknown dynamics and reward model parameters is a challenging task. In a number of domains, ranging from space ro...
Emma Brunskill