Sciweavers

52 search results - page 6 / 11
» Error Bounds for Approximate Policy Iteration
CDC
2009
IEEE
15 years 2 months ago
On the myopic policy for a class of restless bandit problems with applications in dynamic multichannel access
We consider a class of restless multi-armed bandit problems that arises in multi-channel opportunistic communications, where channels are modeled as independent and stochastically...
Keqin Liu, Qing Zhao
CORR
2008
Springer
14 years 9 months ago
Dynamic Rate Allocation in Fading Multiple-access Channels
We consider the problem of rate allocation in a fading Gaussian multiple-access channel (MAC) with fixed transmission powers. Our goal is to maximize a general concave utility func...
Ali ParandehGheibi, Atilla Eryilmaz, Asuman E. Ozd...
ICMLA
2010
14 years 7 months ago
Ensembles of Neural Networks for Robust Reinforcement Learning
Reinforcement learning algorithms that employ neural networks as function approximators have proven to be powerful tools for solving optimal control problems. However, their traini...
Alexander Hans, Steffen Udluft
CDC
2008
IEEE
15 years 4 months ago
Approximate abstractions of discrete-time controlled stochastic hybrid systems
This work proposes a procedure to c...
Alessandro D'Innocenzo, Alessandro Abate, Maria D. Di Benedetto
MP
2002
14 years 9 months ago
UOBYQA: unconstrained optimization by quadratic approximation
UOBYQA is a new algorithm for general unconstrained optimization calculations, that takes account of the curvature of the objective function, F say, by forming quadratic models by ...
M. J. D. Powell