Sciweavers

656 search results - page 35 / 132
» Complexity of finite-horizon Markov decision process problem...
Sort
View
AAAI
2004
15 years 3 months ago
Stochastic Local Search for POMDP Controllers
The search for finite-state controllers for partially observable Markov decision processes (POMDPs) is often based on approaches like gradient ascent, attractive because of their ...
Darius Braziunas, Craig Boutilier
CORR
2006
Springer
113views Education» more  CORR 2006»
15 years 1 months ago
A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD
This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(), LSTD()...
Manuel Loth, Philippe Preux
114
Voted
IPCO
2008
114views Optimization» more  IPCO 2008»
15 years 3 months ago
The Stochastic Machine Replenishment Problem
We study the stochastic machine replenishment problem, which is a canonical special case of closed multiclass queuing systems in Markov decision theory. The problem models the sche...
Kamesh Munagala, Peng Shi
ICASSP
2011
IEEE
14 years 5 months ago
Reinforcement learning for energy-efficient wireless transmission
We consider the problem of energy-efficient point-to-point transmission of delay-sensitive data (e.g. multimedia data) over a fading channel. We propose a rigorous and unified fra...
Nicholas Mastronarde, Mihaela van der Schaar
ICML
2001
IEEE
16 years 2 months ago
Continuous-Time Hierarchical Reinforcement Learning
Hierarchical reinforcement learning (RL) is a general framework which studies how to exploit the structure of actions and tasks to accelerate policy learning in large domains. Pri...
Mohammad Ghavamzadeh, Sridhar Mahadevan