Sciweavers

» Topological Value Iteration Algorithm for Markov Decision Pr...
ISAAC
2010
Springer
14 years 7 months ago
Lower Bounds for Howard's Algorithm for Finding Minimum Mean-Cost Cycles
Howard's policy iteration algorithm is one of the most widely used algorithms for finding optimal policies for controlling Markov Decision Processes (MDPs). When applied to we...
Thomas Dueholm Hansen, Uri Zwick
UAI
2000
14 years 11 months ago
Value-Directed Belief State Approximation for POMDPs
We consider the problem of belief-state monitoring for the purposes of implementing a policy for a partially-observable Markov decision process (POMDP), specifically how one might ap...
Pascal Poupart, Craig Boutilier
DATE
2008
IEEE
15 years 4 months ago
A Framework of Stochastic Power Management Using Hidden Markov Model
The effectiveness of stochastic power management relies on accurate system and workload models and effective policy optimization. Workload modeling is a machine learning proce...
Ying Tan, Qinru Qiu
SIGMETRICS
2000
ACM
15 years 2 months ago
Using the exact state space of a Markov model to compute approximate stationary measures
We present a new approximation algorithm based on an exact representation of the state space S, using decision diagrams, and of the transition rate matrix R, using Kronecker algeb...
Andrew S. Miner, Gianfranco Ciardo, Susanna Donate...
ICML
1999
IEEE
15 years 10 months ago
Least-Squares Temporal Difference Learning
Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...
Justin A. Boyan