Sciweavers

656 search results - page 76 / 132
» Complexity of finite-horizon Markov decision process problem...
Sort
View
ICN
2007
Springer
15 years 8 months ago
Heuristic Approach of Optimal Code Allocation in High Speed Downlink Packet Access Networks
— In this paper, we use the Markov Decision Process (MDP) technique to find the optimal code allocation policy in High-Speed Downlink Packet Access (HSDPA) networks. A discrete ...
Hussein Al-Zubaidy, Jerome Talim, Ioannis Lambadar...
EWRL
2008
15 years 3 months ago
Efficient Reinforcement Learning in Parameterized Models: Discrete Parameter Case
We consider reinforcement learning in the parameterized setup, where the model is known to belong to a parameterized family of Markov Decision Processes (MDPs). We further impose ...
Kirill Dyagilev, Shie Mannor, Nahum Shimkin
UAI
2004
15 years 3 months ago
Region-Based Incremental Pruning for POMDPs
We present a major improvement to the incremental pruning algorithm for solving partially observable Markov decision processes. Our technique targets the cross-sum step of the dyn...
Zhengzhu Feng, Shlomo Zilberstein
ISAAC
2010
Springer
243views Algorithms» more  ISAAC 2010»
14 years 12 months ago
Lower Bounds for Howard's Algorithm for Finding Minimum Mean-Cost Cycles
Howard's policy iteration algorithm is one of the most widely used algorithms for finding optimal policies for controlling Markov Decision Processes (MDPs). When applied to we...
Thomas Dueholm Hansen, Uri Zwick
JMLR
2010
189views more  JMLR 2010»
14 years 8 months ago
Adaptive Step-size Policy Gradients with Average Reward Metric
In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...
Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...