Sciweavers

47 search results - page 4 / 10
Average-Reward Decentralized Markov Decision Processes
NIPS
2001
13 years 7 months ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
CORR
2011
Springer
13 years 1 months ago
Adaptive Channel Recommendation for Dynamic Spectrum Access
We propose a dynamic spectrum access scheme where secondary users recommend “good” channels to each other and access accordingly. We formulate the problem as an average rewa...
Xu Chen, Jianwei Huang, Husheng Li
KI
2007
Springer
13 years 6 months ago
Solving Decentralized Continuous Markov Decision Problems with Structured Reward
We present an approximation method that solves a class of Decentralized hybrid Markov Decision Processes (DEC-HMDPs). These DEC-HMDPs have both discrete and continuous state variab...
Emmanuel Benazera
QEST
2010
IEEE
13 years 4 months ago
Symblicit Calculation of Long-Run Averages for Concurrent Probabilistic Systems
Model checkers for concurrent probabilistic systems have become very popular within the last decade. The study of long-run average behavior has however received only scan...
Ralf Wimmer, Bettina Braitling, Bernd Becker, Erns...
FSTTCS
2010
Springer
13 years 4 months ago
One-Counter Stochastic Games
We study the computational complexity of basic decision problems for one-counter simple stochastic games (OC-SSGs), under various objectives. OC-SSGs are 2-player turn-based stoch...
Tomás Brázdil, Václav Brozek,...