Sciweavers

87 search results - page 2 / 18
» A policy iteration algorithm for Markov decision processes s...
ISAAC
2010
Springer
13 years 3 months ago
Lower Bounds for Howard's Algorithm for Finding Minimum Mean-Cost Cycles
Howard's policy iteration algorithm is one of the most widely used algorithms for finding optimal policies for controlling Markov Decision Processes (MDPs). When applied to we...
Thomas Dueholm Hansen, Uri Zwick
ATAL
2009
Springer
13 years 11 months ago
Online exploration in least-squares policy iteration
One of the key problems in reinforcement learning is balancing exploration and exploitation. Another is learning and acting in large or even continuous Markov decision processes (...
Lihong Li, Michael L. Littman, Christopher R. Mans...
ISCC
2000
IEEE
13 years 9 months ago
Dynamic Routing and Wavelength Assignment Using First Policy Iteration
With standard assumptions the routing and wavelength assignment problem (RWA) can be viewed as a Markov Decision Process (MDP). The problem, however, defies an exact solution bec...
Esa Hyytiä, Jorma T. Virtamo
ECAI
2010
Springer
13 years 6 months ago
On Finding Compromise Solutions in Multiobjective Markov Decision Processes
A Markov Decision Process (MDP) is a general model for solving planning problems under uncertainty. It has been extended to multiobjective MDP to address multicriteria or multiagen...
Patrice Perny, Paul Weng
CORR
2007
Springer
13 years 5 months ago
Paging and Registration in Cellular Networks: Jointly Optimal Policies and an Iterative Algorithm
This paper explores optimization of paging and registration policies in cellular networks. Motion is modeled as a discrete-time Markov process, and minimization of the discount...
Bruce Hajek, Kevin Mitzel, Sichao Yang